ABSTRACT


Line codes which incorporate error control capability for
high speed fiber optic links have been studied in this work.
Balanced codes, which contain equally many 1's and 0's, are
suitable for transmission where dc free pulse formats and low
complexity encoder-decoder implementations are required. A class
of block coset codes derived by partitioning linear block codes
is discussed.

Balanced codes without error correction, derived from linear
block codes, are presented. Balanced codes with error correction,
derived by partitioning a set of balanced words, are described.
Encoding and decoding techniques for such codes are developed. An
improvement over the one dimensional codes, which provides burst
error correction, is presented. The error correction capability of
such array codes is discussed and the encoding and decoding
algorithms are presented.

A comparison of the directly coded 8B10B block code with
that obtained by combining two block codes of smaller length is
discussed. Features such as run length and Running Digital Sum
are tabulated for these codes. The concept of guided scrambling
and its use in line coding is also studied.
CHAPTER - 1


INTRODUCTION 

In fibre optic communication systems, as in other digital
transmission systems, the channel often does not pass dc. This
causes the problem of baseline wander. One way to overcome this
difficulty is to restrict the dc content in the signal stream
using suitably designed codes. As a result, many codes having the
dc constrained property have been studied [ 14- ]. The coding
requirement is defined in terms of a constraint on the Running
Digital Sum (RDS) of the coded signal stream, and the efficiency
of such a dc constrained code is related to the RDS in a definite
way. The question of the best possible way to constrain the RDS
and design a dc constrained code has been addressed by several
authors and, in the process, codes with error detection have been
designed. Improvements have been made in terms of implementation
and reduced complexity.
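
As a rough illustration of the RDS constraint, the following Python
sketch accumulates the RDS of a binary sequence, counting each 1 as
+1 and each 0 as -1 (the usual convention, assumed here). The
function names are illustrative only and are not taken from the
cited works. The peak-to-peak excursion of the RDS is computed as
well, since a bounded excursion is what makes a code dc constrained.

```python
def running_digital_sum(bits):
    """Return the RDS after every bit, counting 1 as +1 and 0 as -1."""
    rds = 0
    trace = []
    for bit in bits:
        rds += 1 if bit else -1
        trace.append(rds)
    return trace


def digital_sum_variation(bits):
    """Peak-to-peak excursion of the RDS, including the start value 0."""
    trace = [0] + running_digital_sum(bits)
    return max(trace) - min(trace)


stream = [1, 1, 0, 1, 0, 0]
print(running_digital_sum(stream))    # RDS after each bit
print(digital_sum_variation(stream))  # peak-to-peak RDS excursion
```

A sequence whose RDS trace stays within fixed limits for every
possible input has no dc component.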

Various mBnB codes are examples of such d.c. constrained 
codes which have found considerable application in fibre optic 
communication links. 

1.1 Features of Line Codes 

Line coding is the process of modifying a source signal to
facilitate proper signal reception in the presence of
transmission impairments. In fiber systems, the following line
code features are required.

(1) Bit sequence independence : The line coder must adequately
    encode any source sequence.

(2) Small low frequency content : The transmitted signal should
    be balanced to allow for ac coupling in the receiver.

(3) Transmission of adequate timing information : The encoded
    bit stream must contain enough level transitions to allow
    for proper operation of clock recovery circuitry.

(4) High efficiency : To keep noise bandwidth and terminal
    circuitry as low as possible, introduced redundancy should
    be kept at a minimum.

(5) Low error multiplication : An error which occurs during
    transmission should not result in many decoded errors.

(6) Low systematic jitter : The sequence of transmitted bits
    must ensure that pattern dependent jitter is kept low.

Keeping the above properties in mind, work has been done in 
designing mBnB block codes [ it*. ] , which incorporate the above 
features.

1.2 mBnB Block Codes

The mBnB codes are fixed length binary block codes. Input
data is divided into blocks of length m bits and each block is
coded into an n bit codeword (n > m). This process increases the
code rate but, at the same time, provides some advantages. The
efficiency of the mBnB code is m/n. For purposes of
synchronization, the block duration of m bits is made equal to the
codeword duration of n bits. Hence there is an increase in the
code rate equal to n/m. The transmitted symbol bit duration is

    T_n = T_m . m/n

where T_m is the bit duration of the m bit uncoded block.

To limit the channel data rate, higher values of m and n are used,
which lead to higher efficiency codes. Usually, n is taken to be
equal to m+1. Two alternate sets of codewords with opposite
disparity are used. This keeps a check on the Digital Sum
Variation (DSV) of the code sequence.

The important features of these mBnB codes are : 

(1) As m increases, the efficiency of the code increases. The
    increase in data rate and bandwidth is smaller with a larger
    value of m. With a large value of m, the coding complexity
    increases. But, for higher capacity systems, coding costs
    are small compared with the total cost.

(2) The mBnB codes exhibit an error extension effect. For each bit
    of receiver decision error, the whole block is in error and
    hence the maximum error spread is m bits per bit of receiver
    error.

(3) With a large value of m, the number of consecutive similar
    bits (run length) also increases and hence the code carries
    less timing information. The DSV also increases with
    increasing m.

(4) Block coders use lookup tables for purposes of encoding and
    decoding. Thus the value of m has to be chosen in such a way
    that lookup tables can be implemented in an easy manner and
    availability of EPROMs at the required data rates is not a
    problem.



1.3 DmB1M and mB1C Codes


A new line code called "DmB1M", which is suitable for very
high speed optical digital transmission, has been developed [ 14 .].
This code possesses a good balance of marks and spaces. In this
encoding process, the speed of the input signal is increased by a
factor of (m+1)/m and then a mark bit is inserted in the (m+1)th
slot. The mark inserted signal (Q) is converted to the DmB1M code
(S) by the equation S_k = S_{k-1} ⊕ Q_k, where S_k and Q_k denote
the kth signal bit of S and Q, respectively. In the decoding
process Q is obtained from Q_k = S_k ⊕ S_{k-1}, where ⊕ represents
mod-2 addition. The original information signal is then recovered
through m/(m+1) speed conversion.

The (m+1)th bit is the complement of the mth bit, which is
generated automatically by the encoding procedure. The increase
in code rate is (m+1)/m. The run length of this code is m+1. The
spectral characteristics of the DmB1M code show that there is no
discrete component except the dc component. As the block length m
increases, suppression of high frequency and low frequency
components weakens and the spectrum assumes the shape of that of
a random signal.
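
The mark insertion and differential coding steps can be sketched
in Python as below. This is only an illustration under stated
assumptions: the initial encoder state s_prev = 0 and the function
names are hypothetical, and the speed conversion is represented
simply by the longer output list.

```python
def dmb1m_encode(data, m, s_prev=0):
    """Insert a mark after every m bits, then apply S_k = S_{k-1} XOR Q_k."""
    if len(data) % m != 0:
        raise ValueError("data length must be a multiple of m")
    # Mark insertion: a 1 is placed in every (m+1)th slot to form Q.
    q = []
    for start in range(0, len(data), m):
        q.extend(data[start:start + m])
        q.append(1)
    # Differential coding of Q gives the line signal S.
    s = []
    for q_k in q:
        s_prev ^= q_k  # s_prev is an assumed initial state (0 by default)
        s.append(s_prev)
    return s


def dmb1m_decode(coded, m, s_prev=0):
    """Recover Q from Q_k = S_k XOR S_{k-1}, then drop the mark bits."""
    q = []
    for s_k in coded:
        q.append(s_k ^ s_prev)
        s_prev = s_k
    return [bit for k, bit in enumerate(q) if (k + 1) % (m + 1) != 0]


data = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1]
line = dmb1m_encode(data, 4)
print(line)
print(dmb1m_decode(line, 4) == data)
```

Because the inserted mark always toggles the differential encoder,
every (m+1)th line bit differs from the bit before it, which is
what limits the run length to m+1.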

Another line code which can be used for high speed
transmission systems is the mB1C code [ 14 -]. In this coding
format, an additional bit, which is the complement of the mth
bit, is introduced after every m bits of the input stream. Thus,
the maximum run length is m+1. The received mB1C coded bit stream
is synchronized by the frame pulse. The stream then has its
transmission speed converted to the original information speed by
a speed converter, after errors in the stream have been detected
by the C bit. Spectral characteristics of the mB1C code show that
the high frequency and low frequency components are suppressed.

The mB1C codes exhibit the following features :

(1) They suppress jitter more effectively when a scrambler
    precedes the coder.

(2) Error rate characteristics in a transmission system which
    suffers from intersymbol interference are improved by use of
    the mB1C code.

An error detection method using C bit checking proved very
effective in this case.

1.4 Scrambling of Data Stream

Scrambling is another technique which has the capability to
meet some of the features of the line coder mentioned earlier. In
self synchronizing scrambling, the source sequence is multiplied
modulo 2 by x^d and then divided by d(x), where d(x) is the
scrambling polynomial and d is the degree of d(x). By properly
choosing d(x), scrambled sequences with a balance of marks and
spaces and adequate timing information can be achieved. A bit
stream of length j is described in polynomial form by
P(x) = c_{j-1} x^{j-1} + ... + c_0. If S(x) is a continuous bit
stream, d(x) is the scrambling polynomial, and Q_{d(x)}[.] denotes
the quotient obtained on division by d(x), the transmitted
scrambled sequence is

    tx(x) = Q_{d(x)}[ S(x) . x^d ],

that is, the quotient obtained by dividing S(x) . x^d by d(x). If
there is an error pattern e(x) during transmission, the decoded
sequence u(x) will be

    u(x) = Q_{x^d}[ rx(x) . d(x) ]
         = Q_{x^d}[ (tx(x) + e(x)) . d(x) ]
         = S(x) + Q_{x^d}[ e(x) . d(x) ]

Thus, in the absence of transmission errors, accurate recovery of
the source bit stream is possible. Scrambling does not introduce
any redundancy. Guided Scrambling [ 4 ] is another technique which
takes care of the problems with scrambling and has the capability
to produce sequences for transmission in a line coding milieu.
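
The division and multiplication by d(x) can be sketched with
shift register style recursions, as in the Python fragment below.
This is a sketch under assumptions: the polynomial is described by
a tuple of tap delays (the delays 3 and 5 correspond to the
hypothetical choice d(x) = x^5 + x^2 + 1), bits are processed
most significant first, and the registers start empty.

```python
def scramble(bits, taps):
    """Divide by d(x): each output bit feeds back through the tap delays."""
    out = []
    for n, bit in enumerate(bits):
        feedback = 0
        for delay in taps:
            if n - delay >= 0:
                feedback ^= out[n - delay]
        out.append(bit ^ feedback)
    return out


def descramble(bits, taps):
    """Multiply by d(x): each output bit is a sum of received bits only."""
    out = []
    for n, bit in enumerate(bits):
        feedforward = 0
        for delay in taps:
            if n - delay >= 0:
                feedforward ^= bits[n - delay]
        out.append(bit ^ feedforward)
    return out


TAPS = (3, 5)  # assumed example polynomial d(x) = x^5 + x^2 + 1
source = [1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 0]
sent = scramble(source, TAPS)
print(descramble(sent, TAPS) == source)
```

Since the descrambler has no feedback, a single channel error
affects only as many decoded bits as d(x) has nonzero terms, which
corresponds to the term Q_{x^d}[ e(x) . d(x) ] above.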

1.5 The PF mB(m+1)B Code


Unlike the mBnB codes, the mB1C and DmB1M codes cannot be
represented by the FSM model as they do not have a state diagram
representation with a finite number of states. In these codes,
the RDS is not constrained and hence is unbounded. Another code,
the PF mB(m+1)B (Partially Flipped mB(m+1)B) code, is an
improvement over the above two codes. In this code, the code
sequences are balanced and the RDS is bounded. The input binary
sequence is grouped into blocks of m bits each. Each word is
precoded into a word m+1 bits long. The (m+1)th bit is a '1' if
there is a majority of 0's in the m bits and it is a '0' if there
is a majority of 1's in the m bits. The m+1 bit word thus formed
is coded into a codeword for transmission. The bits C_i,
i = 1,2,...,m+1 are chosen as

    C_i = B_i                if D_n x WRDS(n-1) ≤ 0
    C_i = complement of B_i  if D_n x WRDS(n-1) > 0

                                          i = 1,2,...,m

    C_{m+1} = B_{m+1}

B_i's are the bits of the (m+1) bit word before coding.

WRDS(n-1) is the word end Running Digital Sum at the end of (n-1)
words.

D_n is the disparity of the nth (m+1) bit word before coding.

After reception, if the majority rule which was applied before
coding is satisfied, the received bits directly give the m bit
input sequence; if it is not satisfied, we have to complement
the bits to yield the m bit input sequence. The run length for
this code is 2m+1. The RDS of the code alternates between
(3m-1)/2 and -(3m-1)/2. The code is dc free. The decoding rule of
the PF mB(m+1)B code leads to an error extension effect.
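
The precoding, partial flipping and majority rule decoding
described above can be sketched in Python as follows. This is an
illustrative reading of the rule, not the original authors'
implementation: it assumes m is odd (so that a majority always
exists), that the word end RDS starts at 0, and that a word is
left unflipped when the product of disparity and word end RDS is
zero. All names are hypothetical.

```python
def pf_encode(data, m):
    """PF mB(m+1)B encoder sketch; m is assumed odd."""
    if m % 2 == 0 or len(data) % m != 0:
        raise ValueError("m must be odd and divide the data length")
    wrds = 0  # word end running digital sum, assumed to start at 0
    coded = []
    for start in range(0, len(data), m):
        block = data[start:start + m]
        ones = sum(block)
        flag = 1 if ones < m - ones else 0  # 1 when 0's are in the majority
        word = block + [flag]
        disparity = 2 * sum(word) - (m + 1)
        if disparity * wrds > 0:
            # Flip only the first m bits; the (m+1)th bit is kept.
            word = [1 - bit for bit in block] + [flag]
            disparity = 2 * sum(word) - (m + 1)
        wrds += disparity
        coded.extend(word)
    return coded


def pf_decode(coded, m):
    """Complement the m data bits when the majority rule is violated."""
    data = []
    for start in range(0, len(coded), m + 1):
        word = coded[start:start + m + 1]
        block, flag = word[:m], word[m]
        ones = sum(block)
        expected = 1 if ones < m - ones else 0
        if flag == expected:
            data.extend(block)
        else:
            data.extend(1 - bit for bit in block)
    return data


blocks = (0, 0, 7, 7, 1, 6)
data = [(w >> i) & 1 for w in blocks for i in (2, 1, 0)]
print(pf_decode(pf_encode(data, 3), 3) == data)
```

A single channel error in the (m+1)th bit makes the decoder
complement all m data bits, which is the error extension effect
mentioned above.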


In the above discussion we considered only binary codes such
as mBnB, mB1C, DmB1M and PF mB(m+1)B, because in optical fiber
systems, binary transmission is preferred over M-ary systems. Due
to the signal dependent shot noise in such systems, decision
levels would need to be nonuniformly spaced if multilevel
signals are transmitted. Also, decision and regeneration of
multilevel signals are more complicated than for binary signals.

1.6 Introduction to Error Control Coding

When we model the communication channel as a Binary
Symmetric Channel (BSC) or a Binary Erasure Channel (BEC), there
is a finite probability with which a bit can be erroneously
received. Decoding failure can occur because of channel
characteristics like low frequency cutoff, where the received
bits appear as decaying amplitudes. Even though the probability
of erroneous decoding is normally low in fiber optic
transmission, it is advantageous to use Error Control Coding to
provide reliable communication, especially between computers with
fiber optic links.

Error Correcting Codes are divided into two broad
categories, namely block codes and convolutional codes. The block
codes can be linear or cyclic codes. Linear block codes are those
in which a linear combination of codewords is also a codeword;
in cyclic codes, a cyclic shift of a codeword is also a
codeword. Systematic codes are those in which, in every codeword,
the information part is retained as it is. The decoding is easier
for such codes.

In the following, we consider block codes of both Error
Detecting and Error Correcting (EC) types. Here we are mainly
concerned with the problems of communication. Codes which are
pertinent to this work and also applicable in areas like magnetic
recording have been gathered in a coherent manner. Some modified
codes with EC capability are also introduced here.



1.7 Organization of the Thesis 


The thesis is organised as follows : 

In Chapter Two , discussions on Error detection and 
correction capabilities of the codes proposed by various authors 
are included. 

Chapter Three contains discussion on modification of 
existing codes which we have suggested. They are efficient and 
emphasize techniques which are not used in the presently 
available dc constrained code design. 

In Chapter Four, we make a comparison of the directly coded 
8B10B code with the combined 8B10B code. This is followed by a 
discussion on the dc constrained code efficiency. 

Chapter Five introduces the guided scrambling concept. 

Finally, the conclusions are presented in Chapter Six. 


Suggestions for further work are also included here. 



CHAPTER - II 


INTRODUCTION 


In practice, the line coder is modelled as a Finite State
Machine (FSM). The message sequence is split into blocks of, say,
m bits each and each block is coded into n bits. Since the line
code must satisfy the rules discussed previously, an alternative
codeword of opposite disparity is provided for each codeword. The
FSM of an encoder might look as shown below.



[Figure: FSM of an encoder with three states, with transitions
labelled by codeword u_0 with 0 disparity, codeword u_{-1} with -1
disparity and codeword u_{+1} with +1 disparity.]


Each state of the FSM is determined by one of the possible
values of the word end Running Digital Sum (WRDS) of the code
sequence. Suppose we have three states given by 0, +1 and -1 as
shown above. If the coder is in state -1 and a codeword u_0 is
encoded, the coder remains in state -1. If a codeword u_{+1} is
encoded, the resulting state is 0 and so on. Thus, the incoming
message sequence results in a transition to the same or another
state. The encoding rule ensures an alternating RDS between two
opposite levels so that the resultant code sequence is dc free.
More details on the FSM and its use can be seen in [ Wj. ].

The size of the FSM model and the analysis can be made
simpler if there are fewer states, and the simplest model is one
in which there is a single state. If this lone state corresponds
to a WRDS > 0, the code has a constant dc. If the state
corresponds to WRDS = 0, the code is dc free. If the code has
constant dc, the level can be restored using dc recovery
circuits. Balanced codes are more advantageous because they do
not require the additional circuitry. Also, balanced codes
provide powerful error detection.

The balanced codes proposed by Knuth [ 4. ] and Bose [ z. ]
serve the purpose in a telecommunication environment. These codes
are of the error detecting type only, but their encoding and
decoding algorithms are simple. The work on balanced codes was
first done by Knuth. We give some of the important results of his
work below.


2.1 Efficient Balanced Codes

Let n be the number of message bits and P be the number of
parity bits. Let

    M(m) = C(m, ⌊m/2⌋)

be the total number of balanced words of length m and weight
⌊m/2⌋, where C(m, j) is the binomial coefficient. To provide an
adequate number of balanced words, we should have M(n+P) ≥ 2^n.
Using Stirling's approximation, the optimum value of n is chosen
to be 2^P. Thus, in the communication environment we have a code
set of balanced codewords with an increase in code rate of
(n+P)/n = 1 + P/2^P. The RDS is bounded, with WRDS = 0 if P is
even and WRDS = ±1 if P is odd. Thus, if P is odd, we have to
choose a complementary pair of codewords from which the
transmitted codeword is to be chosen such that we conform to the
rules of alternate disparity word transmission.


PARALLEL SCHEME :

    v(w)   = total number of 1's in the binary word w.
    v_k(w) = number of 1's in the first k bits of w.
    w^(k)  = word w with the first k bits complemented.

Thus, we have v(w^(k)) = v(w) + k - 2 v_k(w).

If w has length n, let σ_k(w) stand for v(w^(k)). The quantity
σ_k(w) changes by ±1 when k increases by 1. It describes a "random
walk" from σ_0(w) = v(w) to σ_n(w) = n - v(w). Since ⌊n/2⌋ lies in
the closed interval between v and n-v for all integers v, there
always exists a k such that σ_k(w) = ⌊n/2⌋. In other words, every
word w can be balanced in the above manner. If we encode k in a
balanced word u of length P, and if n and P are not both odd, we
can let w correspond to the balanced codeword u w^(k).

For example, if we have an 8 bit information word w, we can
find a k such that w^(k) is balanced; the number of parity bits is
arbitrarily chosen to be 5. Parallel decoding makes use of a
lookup table. Thus we have a lookup table with eight words, each
word 5 bits long. The eight words correspond to k = 0,8 and
1,2,3,4,5,6,7. Thus the lookup table is very small in size.
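
The random walk argument translates directly into a search for the
smallest k, sketched below in Python. Only the balancing of w is
shown; the encoding of k into the prefix u is left out, and the
function name is illustrative.

```python
def knuth_balance(word):
    """Return (k, w_k) where w_k is word with its first k bits
    complemented and has weight floor(n/2); k is the smallest such index."""
    n = len(word)
    target = n // 2
    for k in range(n + 1):
        candidate = [1 - bit for bit in word[:k]] + list(word[k:])
        if sum(candidate) == target:
            return k, candidate
    # Not reached: the walk from v(w) to n - v(w) must cross floor(n/2).
    raise ValueError("no balancing index found")


k, balanced = knuth_balance([1, 1, 1, 1, 1, 1, 1, 0])
print(k, balanced)
```

The receiver, once it knows k, restores w by complementing the
first k bits again.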



SERIAL SCHEME :

This is an improvement over the previous scheme in that any
imbalance in the n information bits is compensated by a
corresponding imbalance in the P check bits. For n = 8 and P = 3,
the following lookup table can be used.

    v(w)    u     s
     0     111    2
     1     110    3
     2     001    4
     3     001    4
     4     011    3
     5     010    4
     6     101    3
     7     100    4
     8     000    5

where s = v(w^(k)). The word u w^(k) will be balanced if and only
if v[u w^(k)] = v(u) + s = ⌊(n+P)/2⌋ = 5. The code is defined by
choosing the smallest k such that σ_k(w) = s.

To decode this scheme, first v(w) is determined from u, then
the smallest k is found such that σ_k(w) = v(w). Thus the
observation in the above schemes is that they are error detecting
only; because of the minimum distance property, an error is
detected if v(w) does not tally with that implied by u and vice
versa. Also, since the rate increase is not very significant, they
can be used for low speed transmission systems. The maximum run
length is ⌊n/2⌋.


An extension of the work done by Knuth is taken up by Bose [z.].
From his discussion, an important use of the balanced code is
observed. Even if they do not provide error correction, they can
detect all unidirectional errors. The necessary properties for
unidirectional error detection are discussed later. We can model
the channel as one in which only 1 → 0 or only 0 → 1 transitions
take place. If we are using one such channel, we have two
advantages: such codes provide the desired dc characteristics as
well as detect all unidirectional errors. In the next section the
method suggested by Bose is described in detail.

2.2 BALANCED CODES WITHOUT ERROR CORRECTION 


Let k  = number of information bits
    r  = number of check bits
    ko = number of 0's in the information part
    k1 = number of 1's in the information part


PARALLEL CODING SCHEME :

If we have 8 bit long information symbols, there can be
words of weight 0,1,2,...,8. There is an all zero word and an
all one word. If we complement their first 4 bits, we get
11110000 and 00001111, and the check symbol is 011 for both.

The check symbols for other information symbols are based on
the number of 0's. If ko = 7,6,5 or 4, the check symbol is ko in
binary. If ko = 3,2 or 1, the check symbol is ko-1 in binary. Thus,

we have

    w(x)    check
     1       111
     2       110
     3       101
     4       100
     5       010
     6       001
     7       000

    word        modified    check
    00000000    11110000    011
    11111111    00001111    011

Thus, in general,

(1) if ko = 0 mod 2^r, complement the first 2^(r-1) bits and
    append 2^(r-1) - 1 in binary as check.

(2) if 2^(r-1) ≤ ko ≤ 2^r - 1, append ko in binary as check.

(3) if 1 ≤ ko ≤ 2^(r-1) - 1, append ko-1 in binary as check.

Thus we obtain balanced codes, but there is no error correction.

Example :

k = 11, r = 3; a 7 out of 14 code is constructed.

Words of weight i mod 8, where i = 0,1,2,3, can be mapped into
words of weight 4,5,6,7; for words of weight j = 4,5,6,7, words
can be formed with weight in the range j to 11-j. One interesting
characteristic of this particular code is that the check symbol
directly represents the weight mod 8 of the information symbol.

    w(x)      0   1   2   3   4   5   6   7   8   9   10  11
    w(F(x))   7   6   6   5   6   5   5   4   7   6   6   5
    check     000 001 010 011 100 101 110 111 000 001 010 011

Decoding : If the check symbol value is i, 0 ≤ i ≤ 7, complement
the first 0,1,2,...,11 bits until the new word has weight i mod 8.
That is, find j such that σ_j(x) = i mod 8 and complement the
first j bits of x to get the original information symbol, where
0 ≤ j ≤ 11.
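
The k = 11, r = 3 example can be checked with a short Python
sketch. The reading assumed here is that the check symbol is the
information weight mod 8 in binary, appended after the modified
word, and that both encoder and decoder use the smallest prefix
length that meets their weight condition. The function names are
illustrative.

```python
K, R = 11, 3  # information bits and check bits of the example


def complement_prefix(word, j):
    """Return word with its first j bits complemented."""
    return [1 - bit for bit in word[:j]] + list(word[j:])


def encode_7_of_14(info):
    """Map an 11 bit word to a 14 bit word of weight 7."""
    i = sum(info) % (1 << R)
    check = [(i >> shift) & 1 for shift in (2, 1, 0)]
    target = (K + R) // 2 - sum(check)  # required weight of the 11 bit part
    for j in range(K + 1):
        modified = complement_prefix(info, j)
        if sum(modified) == target:
            return modified + check
    raise ValueError("no balancing prefix found")


def decode_7_of_14(codeword):
    """Complement prefixes until the weight equals the check value mod 8."""
    modified, check = codeword[:K], codeword[K:]
    i = 4 * check[0] + 2 * check[1] + check[2]
    for j in range(K + 1):
        candidate = complement_prefix(modified, j)
        if sum(candidate) % (1 << R) == i:
            return candidate
    raise ValueError("no consistent information word found")


info = [0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
codeword = encode_7_of_14(info)
print(codeword, sum(codeword))
print(decode_7_of_14(codeword) == info)
```

Running the encoder over all 2^11 information words gives
codewords of weight 7 in every case, in agreement with the
w(F(x)) row of the table.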

The general construction method is as follows :

Let the number of information bits be k = 2^r + J where
0 ≤ J ≤ 2^r - (r+2) for r ≥ 2.

Convert all information words of weight i and 2^r + i to words of
weight w_i for all i = 0,1,2,...,J and then append an r bit check
symbol cs such that w_i + w(cs) = ⌊(k+r)/2⌋. Similarly, words of
weight j, j = J+1, J+2,..., 2^r - 1 are mapped into words of
weight w_j, where w_j is in the range [j, k-j], and an r bit check
symbol is appended such that

    w_j + w(cs) = ⌊(k+r)/2⌋.


•• [2] 


m + 

2j " 


i, i = 


can be mapped into words of weight + l and a 




The information symbols with weight other than 

0 , 1 , 2 , 

check symbol of weight J is added auch that 
l + J 

We have seen in the above discussion various methods to
design balanced codes. The simplicity of the codes lies in the
fact that the encoding and decoding tables are very small compared
with the standard arrays for decoding EC linear block codes.
These codes are only error detecting, a property which is provided
by the balance of the codewords. Since very little redundancy is
introduced, the increase in code rate is kept low.


In the next section, we will see the design of dc free codes 
which provide both error detection and correction. 


2.3 DC FREE - COSET CODES


The motivation for the work on dc constrained codes for
fiber optic link communication was the work done by Herro and
Deng [3]. This area has received much attention lately. There are
some constraints for line coding (discussed previously) which have
to be incorporated while designing codes. Thus error correcting
codes which incorporate these features are of much interest,
because only error detecting codes which handle the pertinent
features have been developed till date. The codes designed by
Herro and Deng incorporate zero dc spectrum and limited run
length properties. Line codes which have zero dc spectrum and
limited run length have been designed previously, but they have
only error detection capability.

Notations :

    D  = maximum running disparity after any bit position.
    D' = running disparity at the end of a codeword.

Usually run length is denoted by (l,L) where l = minimum run
length - 1 and L = maximum run length - 1; usually in the digital
transmission system of concern, l = 0.

The dc free coset codes are denoted by (n,k,D) where
n = codeword length and k = information length, in bits.

2.3.1 DC Free Coset Codes with Error Detection only

An n-bit codeword v is given by

    v = (0,u) + a 1_n

where u   = (n-1) bit information word
      a   = 1 or 0
      1_n = n bit all 1 vector.

The coset code consists of the linear code (a = 0)

    T_1 = { v_i | v_i = (0,u_i), i = 0,1,...,2^(n-1) - 1 }

and its coset (a = 1)

    T_2 = { v_i' | v_i' = v_i + 1_n, i = 0,1,...,2^(n-1) - 1 }

where v_i' is the complement of v_i.

The construction of the dc free coset code is based on "vector
space partitioning". The linear space of 2^n vectors is
partitioned into 2^(n-1) disjoint subsets

    { A_0, A_1, ..., A_{2^(n-1)-1} }

where A_i = { v_i, v_i' } for i = 0,1,...,2^(n-1) - 1.

Let D'_t be the running disparity at the end of a codeword at time
t. If an (n-1) bit information vector is to be encoded at time t,
a subset A_i is chosen, and D'_{t-1} is used to select a codeword
in A_i.

DC FREE COSET CODES WITHOUT ERROR CORRECTING CAPABILITY :

Encoding : Suppose an (n-1) bit information vector u is to be
encoded at time t. Let D'_{t-1} be the running disparity at the
end of a codeword at time t-1. The coding steps are as follows :

(1) v ← (0,u), D'_t ← D'_{t-1}; d_t ← disparity of v at time t.

(2) if D'_t . d_t ≤ 0, D'_t ← D'_t + d_t then go to 4; or else
    go to 3.

(3) v ← v + 1_n, D'_t ← D'_t - d_t.

(4) encode next n-1 information bits.



Decoding :

Let v̂ = (v̂_1, v̂_2, ..., v̂_n) be the received version of v. The
decoding algorithm is as given below.

(1) if v̂_1 = 0, û = (v̂_2, v̂_3, ..., v̂_n); otherwise
    û = (v̂_2', v̂_3', ..., v̂_n'), where v̂_i' is the complement
    of v̂_i.

(2) decode next n bit received word.

The running disparity at the end of any codeword is bounded
by |D'| ≤ n. The maximum running disparity at any given bit
position is given by |D| = n + ⌊n/2⌋. The worst case occurs when
the disparity at the end of a codeword is 0, followed by an all 1
codeword, then followed by a codeword with 1's in its first ⌊n/2⌋
positions. The maximum run length is L = 2n + ⌊n/2⌋ - 1 and the
worst case occurs when the disparity at the end of a codeword is
n, followed by two all 0 codewords, then followed by a codeword
with 0's in its first ⌊n/2⌋ positions.
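
The encoding and decoding steps of this error detecting coset code
can be sketched in Python as below. The sketch assumes that the
word end disparity starts at 0 and counts a 1 as +1 and a 0 as -1;
the function names are illustrative. The encoder also returns the
word end disparity after each codeword so that the bound |D'| ≤ n
can be observed.

```python
def coset_encode(info_words, n):
    """Encode (n-1) bit words; return the codewords and the word end
    running disparity after each codeword."""
    d_end = 0  # running disparity at the end of the previous codeword
    codewords, trace = [], []
    for u in info_words:
        if len(u) != n - 1:
            raise ValueError("each information word must have n-1 bits")
        v = [0] + list(u)
        d = 2 * sum(v) - n  # disparity of (0,u)
        if d_end * d <= 0:
            d_end += d
        else:
            v = [1 - bit for bit in v]  # v + 1_n, the complement
            d_end -= d
        codewords.append(v)
        trace.append(d_end)
    return codewords, trace


def coset_decode(codewords):
    """The leading bit tells whether the word was complemented."""
    decoded = []
    for v in codewords:
        if v[0] == 0:
            decoded.append(list(v[1:]))
        else:
            decoded.append([1 - bit for bit in v[1:]])
    return decoded


words = [[1, 1, 1], [1, 1, 1], [0, 0, 0], [0, 1, 1]]
coded, disparity_trace = coset_encode(words, 4)
print(coded)
print(disparity_trace)
print(coset_decode(coded) == words)
```

A single error in the leading bit of a received word complements
the whole decoded word, so the code by itself offers no error
correction.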


2.3.2 DC Free Codes with Error Correcting Capability

In this section, the previous ideas are extended to provide
an improvement in the capability of the code.

The code is defined by (n,k+J) where J is the number of 0's
added to the MSB of the information word, which is then converted
into a codeword n bits long. This can correct t errors and
simultaneously detect λ or fewer errors provided that
2t+λ+1 ≤ d_min, where d_min is the minimum distance of the code.
Though linear block codes can be designed for powerful error
correction, they do not have good dc properties. The following
code provides a compromise between the two properties.



The dc free coset code is defined by

    v = u(G_2 G_1) + g

where u is a k bit information vector and G_2 is a k x (k+J)
matrix called the transfer matrix. The matrix G_2 transfers the k
bit information vector into k bits of a (k+J) bit vector with the
first J bits 0's, and

    g = Σ_{j=1}^{J} a_j g_j,   a_j = 0 or 1.

Thus, g is a linear combination of the first J rows of the
(k+J) x n matrix G_1, which is the generator matrix. The codeword
g is used to control disparity. The first J rows of G_1 are chosen
to satisfy

    supp(g_i) ∩ supp(g_j) = ∅ for i ≠ j, and
    g_1 + g_2 + ... + g_J = 1_n.

supp(g_j) is the set of coordinates at which the components of g_j
are nonzero. Thus g_j controls the disparity of the coordinates
supp(g_j), and g_1 + g_2 + ... + g_J = 1_n guarantees that the
disparity at all the coordinates of v is controlled.

G_2 is given by a k x (k+J) matrix of the form

    G_2 = [ 0  I_k ]

where 0 is the k x J all zero matrix and I_k is the k x k identity
matrix.

Thus, we have

    v = u G_2 G_1 + g = (0_J, u) G_1 + Σ_{j=1}^{J} a_j g_j
      = (a_1, a_2, ..., a_J, u) G_1

where 0_J stands for J zeros. Also G_2 G_2^T = I_k.

The information vector can be retrieved from
u = (a_1, a_2, ..., a_J, u) G_2^T.
Encoding :

A k bit information vector can be encoded at time t as
follows :

(1) v ← u G_2 G_1, j ← 1, D'_t ← D'_{t-1}.

(2) if D'_t . d_j ≤ 0, D'_t ← D'_t + d_j then j ← j+1 and go to
    4; or else go to 3.

(3) v ← v + g_j, D'_t ← D'_t - d_j, j ← j+1;

(4) if j ≤ J, then go to 2; else encode next information block.

Here d_j denotes the disparity of v over the coordinates
supp(g_j).
Decoding :

Let v̂ be the received version of v = (a_1, a_2, ..., a_J, u) G_1.

(1) Find (â_1, â_2, ..., â_J, û) based on the block code with
    generator matrix G_1;

(2) û = (â_1, â_2, ..., â_J, û) G_2^T

(3) decode next received word.


Properties of the Code :

There is an obvious improvement here in the dc properties of
the code because the all zero and all one words can also be
modified. Thus we have |D'| ≤ max_j w_j, where w_j is the Hamming
weight of g_j. It is stated, using analogous reasoning, that

    |D| ≤ w_max + ⌊w_max/2⌋  and  L ≤ 2 w_max + ⌊w_max/2⌋ - 2,

where w_max = max_j w_j. A bound on the number J is also derived.

Further details of these codes were provided by Popplewell and
O'Reilly [8].


2.4 ERROR CORRECTING (16,8) BALANCED CODE

In the balanced codes we have seen that, since the minimum
distance is 2, there is no error correction. An improvement over
those codes is to increase the minimum distance and at the same
time provide run length constraints. Work in this area has been
done by Ferreira [4-] and an improvement has been proposed by
Blaum [5]. In the following section, we provide some details of
both the codes designed by them. Ferreira proposed a (16,8)
d_min = 4 dc free block code with maximum run length = 8 and
maximum RDS = 5. This is particularly useful because memory
elements with 16 or 8 address lines are readily available.


The 16 bit codeword is represented as follows :

    cw = a_1 a_2 b_1 b_2 c_1 c_2 d_1 d_2

Let two of the 2-tuples (n_1, n_2), n = a, b, c or d, in cw
contain only elements of A and the other two contain only
elements of B, where

    A = { 00, 11 } and B = { 01, 10 }.

Let w_α(cw) be the weight of element α in cw, that is, the number
of times α occurs in cw. Let w_00(cw) = w_11(cw) in order for cw
to be dc free, and also let w_01(cw) and w_10(cw) be even.
Since two of the four 2-tuples in cw must contain only
elements from A and the other two only elements from B, there are
C(4,2) = 6 different ways in which the four 2-tuples can be chosen
from A and B. Since w_00(cw) = w_11(cw), there are C(4,2) = 6 ways
in which these symbols can be chosen from A. Since w_01(cw) and
w_10(cw) are always even, there are C(4,0) + C(4,2) + C(4,4) = 8
ways in which the four symbols of cw which are elements of B can
be chosen from B. Thus, the total number of different codewords
is 6.6.8 = 288 > 256 = 2^8.
Thus a code with k = 8 can be formed.
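
The count of 288 can be verified by brute force. The Python sketch
below enumerates all 16 bit words, keeps those that satisfy the
three conditions as read above (two 2-tuples from A and two from
B, w_00 = w_11, and w_01 even), and also computes the minimum
Hamming distance of the resulting set. The helper names are
illustrative.

```python
from itertools import product

A = {(0, 0), (1, 1)}
B = {(0, 1), (1, 0)}


def is_codeword(bits):
    """Check the conditions on the eight 2 bit symbols of a 16 bit word."""
    symbols = [tuple(bits[i:i + 2]) for i in range(0, 16, 2)]
    kinds = []
    for i in range(0, 8, 2):  # the 2-tuples (a1,a2), (b1,b2), (c1,c2), (d1,d2)
        pair = symbols[i:i + 2]
        if all(s in A for s in pair):
            kinds.append("A")
        elif all(s in B for s in pair):
            kinds.append("B")
        else:
            return False
    if kinds.count("A") != 2:
        return False
    if symbols.count((0, 0)) != symbols.count((1, 1)):
        return False
    return symbols.count((0, 1)) % 2 == 0


def min_distance(words):
    """Smallest Hamming distance between two distinct words."""
    best = len(words[0])
    for i, first in enumerate(words):
        for second in words[i + 1:]:
            dist = sum(x != y for x, y in zip(first, second))
            best = min(best, dist)
    return best


codewords = [w for w in product((0, 1), repeat=16) if is_codeword(w)]
print(len(codewords))
print(min_distance(codewords))
```

Every word found this way has weight 8, so the whole set is
balanced before the 256 codewords are selected from it.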


From the set of 288 codewords, 256 are selected, all of which
are balanced. The maximum run length can occur when a codeword
ending with c_2 d_1 d_2 = 100000 is followed by a word starting
with a_1 a_2 b_1 = 000001; hence l = 9. The maximum value of RDS
is 5, which occurs when a_1 a_2 b_1 = 000001.

By omitting 32 out of the 288 words, the run length can be
reduced to 8, hence (b,l,c) = (0,7,5).

ENCODING AND DECODING :

The 256 different information words can be mapped onto the
codewords by means of a 256x16 bit ROM. This completes encoding.

DECODING :

Since d_min = 4, a single bit error can be corrected. Decoding
can be done by implementing a 64Kx8 bit lookup table in ROMs.
(1) Examine the four 2-tuples (x_1, x_2) in the received word.
    Detect the 2-tuple in error, since it does not contain only
    elements of A or only elements of B.

(2) Determine the set A or B from which the elements of the
    2-tuple in error were chosen. If there are two other 2-tuples
    with only elements of B and one other 2-tuple with only
    elements of A, the 2-tuple in error should contain only
    elements of A.

(3) Locate the symbol in error, since it is the symbol in the
    2-tuple which is not an element of the set A or B as
    determined in (2).

(4) If the symbol in error should be an element of A, invert one
    of the two bits of the symbol such that dc balance of the
    received word is restored.

(5) If the symbol in error should be an element of B, invert one
    of the two symbol bits such that w_01(cw) and w_10(cw) are
    even.

(6) Map the corrected received word onto the corresponding
    information word by means of a 256x8 bit PLA with 16 input
    lines.

We see that the increase in code rate is 2, with a
corresponding increase in bandwidth of similar magnitude. The run
length is 8, which is fairly large. Another (16,9,6,5,4) code is
proposed by Blaum [5] in which the increase in code rate is
reduced from 2 to 16/9 ≈ 1.78. This is also a single error
correcting code since d_min = 4. In the following section, the
code proposed by Blaum is discussed in detail.



2.5 DC FREE ERROR CORRECTING BALANCED CODE 


The code is defined by the parameters (2n,k,l,c,d). 2n is
the length of the codeword, each of weight n. In other words, the
codewords are balanced words. k is the information length; l is
the run length; c is the RDS, and d is d_min. Concatenation of an
error correcting code and a dc free code can be of great use in a
fibre optic channel where the codeword takes care of some of the
properties pertinent to such a channel. Associate symbol 0 with
disparity f(0) = -1 and symbol 1 with f(1) = +1.

Construction of (16,9,6,5,4) code :

Consider each codeword as formed by eight 2-tuples. Fill
each 2-tuple with 10 or 01 in such a way that the number of 10's
is even. These are referred to as type I codewords, and there are
$2^7 = 128$ such codewords.

Next consider each codeword as formed by four blocks of four
bits each. Choose any two blocks and place any two 4-tuples of
weight 1 in them. Take the first of the remaining two blocks and
place a 4-tuple of weight 3 in it. Take the first block of weight
1 and see how many times it has to be rotated to the right to make
it the complement of this 4-tuple of weight 3. In the last block,
place the 4-tuple of weight 3 which is the complement of the
4-tuple obtained by rotating the second 4-tuple of weight 1 that
many times.

There are $6 \times 4^3 = 384$ such codewords. These are type II
codewords. Thus the total number of codewords is
128 + 384 = 512.

Thus the message bits are 9. The maximum run length is 6. The
maximum RDS is |5|.


ENCODING AND DECODING :

Let the 9 bit information vector be $a = a_1 a_2 \ldots a_9$.

If $a_1 a_2 = 00$, it will be encoded as type I. Let

$$a_{10} = \sum_{i=3}^{9} a_i \bmod 2$$

and the codeword is

$$c(a) = a_3 \bar{a}_3\, a_4 \bar{a}_4 \ldots a_{10} \bar{a}_{10}$$

If $a_1 a_2 \ne 00$, it is coded by type II. We have six possible
choices for $a_1 a_2 a_3$. The two blocks in which we place
4-tuples of weight 1 are chosen as shown below :

    a1 a2 a3     Blocks
    0  1  0      AB
    0  1  1      AC
    1  0  0      AD
    1  0  1      BC
    1  1  0      BD
    1  1  1      CD

$a_4 a_5$ will determine the 4-tuple in the first block and
$a_6 a_7$ will determine the 4-tuple in the second block.

    00 -> 1000
    01 -> 0100
    10 -> 0010
    11 -> 0001

$a_8 a_9$ can be viewed as the binary representation of an integer
J, where $0 \le J \le 3$. Consider the complement of the first
4-tuple and rotate it J times to the right. This gives the first
4-tuple of weight 3. The same is done to the second 4-tuple of
weight 1 to get the second 4-tuple of weight 3.

Let c(a) be the received word. Assume that no more than one
error has occurred. The first step is to determine whether the
word is of type I or type II.

If the majority of the blocks A, B, C, D in c(a) have weight 1 or
3, it is of type II; otherwise it is of type I.

Assume c(a) is of type I. If an error occurs, one of the 2-tuples
is either 00 or 11. Consider the remaining 2-tuples and count how
many times 10 appears. If this is even, the error 2-tuple is 01;
if odd, the error 2-tuple is 10.

Assume c(a) is of type II. Since there is one error, one block out
of the four will have even weight.

Let the four bit blocks $H_1$ and $H_2$ in c(a) have the same
weight. Let $G_1$ and $G_2$ be the remaining two blocks. Let $H_1$
precede $H_2$ and $G_1$ precede $G_2$.

Let $\rho_J(K)$ be the rotation of block K, J times to the right.
We obtain a from c(a) by applying the following algorithm :

(1) Determine J such that $\rho_J(\bar{H}_1) = G_1$ or
    $\rho_J(\bar{H}_2) = G_2$, $0 \le J \le 3$.

(2) In c(a), replace $G_1$ by $\rho_J(\bar{H}_1)$ and $G_2$ by
    $\rho_J(\bar{H}_2)$. The vector obtained is the correct
    codeword c(a).

(3) Obtain a from c(a).
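
The encoding rule of this section can be checked numerically. The sketch below is only an illustration under stated assumptions: blocks A, B, C, D are taken as the four consecutive 4-bit groups of the codeword, the type I parity bit is placed last, and the helper names (`encode`, `rotate_right`, `BLOCK_PAIRS`, `ONE_HOT`) are hypothetical and not taken from [5].

```python
from itertools import combinations, product

# a4a5 (or a6a7) -> 4-tuple of weight 1, as in the mapping table above.
ONE_HOT = {(0, 0): "1000", (0, 1): "0100", (1, 0): "0010", (1, 1): "0001"}
# a1a2a3 -> indices of the two blocks (A=0 .. D=3) that carry weight-1 tuples.
BLOCK_PAIRS = {(0, 1, 0): (0, 1), (0, 1, 1): (0, 2), (1, 0, 0): (0, 3),
               (1, 0, 1): (1, 2), (1, 1, 0): (1, 3), (1, 1, 1): (2, 3)}

def complement(bits):
    return "".join("1" if b == "0" else "0" for b in bits)

def rotate_right(bits, j):
    j %= len(bits)
    return bits[-j:] + bits[:-j] if j else bits

def encode(a):
    """Map a 9-bit tuple (a1..a9) to a 16-bit codeword string."""
    if a[0] == 0 and a[1] == 0:                 # type I
        body = list(a[2:])                      # a3 .. a9
        body.append(sum(body) % 2)              # a10: makes the number of 10's even
        return "".join(f"{b}{1 - b}" for b in body)
    first, second = BLOCK_PAIRS[tuple(a[:3])]   # type II
    t1, t2 = ONE_HOT[tuple(a[3:5])], ONE_HOT[tuple(a[5:7])]
    j = 2 * a[7] + a[8]                         # a8a9 read as the integer J
    blocks = [None] * 4
    blocks[first], blocks[second] = t1, t2
    rest = [i for i in range(4) if blocks[i] is None]
    blocks[rest[0]] = rotate_right(complement(t1), j)
    blocks[rest[1]] = rotate_right(complement(t2), j)
    return "".join(blocks)

def hamming(x, y):
    return sum(p != q for p, q in zip(x, y))

codebook = [encode(a) for a in product((0, 1), repeat=9)]
d_min = min(hamming(x, y) for x, y in combinations(codebook, 2))
print(len(set(codebook)), d_min)
```

Under these assumptions the script enumerates all 512 information words, so the codeword count, the balance of every codeword and the minimum distance quoted in the text can be compared against the computed values.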

In this chapter we have seen the design of balanced codes
followed by that of single error correcting balanced codes. The
EC balanced codes are very useful because the code length is 16
and hence memories, which are usually organised in bytes, can be
used efficiently for look-up tables. Also, the EC balanced
codes are more efficient than the EC dc free coset codes [3],
because the increase in code rate is smaller in the case of
balanced codes.

In the next chapter, we see a different approach for
designing the dc constrained codes and we propose codes with
error detection only, followed by codes with both error detection
and error correction. These codes are efficient.



CHAPTER - 3 


MODIFIED DC FREE CODES 


As already mentioned, line coding has been defined as a
technique where the code incorporates features like (1) adequate
timing information (2) disparity control (3) zero power spectral
density at dc (4) balance of 1's and 0's in a codeword, etc. Line
coding does not incorporate error correction, which may be
desirable for the fibre optic channels under consideration. Using
the features of scrambling and linear Error Correcting Codes, we
introduce a coding scheme which incorporates line code features.

3.1 DC CONSTRAINED CODES - WITH ERROR DETECTION ONLY 

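We use a linear (7,4) block code which has a minimum
distance of 3.

The parity check matrix is

$$H = \begin{pmatrix} 1&0&0&1&0&1&1 \\ 0&1&0&1&1&1&0 \\ 0&0&1&0&1&1&1 \end{pmatrix}$$

The generator matrix is

$$G = \begin{pmatrix} 1&1&0&1&0&0&0 \\ 0&1&1&0&1&0&0 \\ 1&1&1&0&0&1&0 \\ 1&0&1&0&0&0&1 \end{pmatrix}$$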

The codeword table is given below :

    Message    Codeword    Disparity
    0000       0000000     -7
    0001       1010001     -1
    0010       1110010     +1
    0011       0100011     -1
    0100       1100100     -1
    0101       0110101     +1
    0110       0010110     -1
    0111       1000111     +1
    1000       0111000     -1
    1001       1101001     +1
    1010       1001010     -1
    1011       0011011     +1
    1100       1011100     +1
    1101       0001101     -1
    1110       0101110     +1
    1111       1111111     +7

Out of the sixteen codewords, fourteen can be divided into
two blocks of disparities +1 and -1 respectively. Also, the
blocks are self complementary. Thus one part of the encoding
process is achieved. The all zero and all one codewords are also
complementary, but are not included, since they cause a large run
length. Thus these two are to be modified.
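
The disparity split described above can be reproduced from the generator matrix. The following is a minimal sketch, assuming the rows of G exactly as printed in this section and assuming that the i-th message bit selects the i-th row; the assignment of codewords to individual messages may therefore differ from the table. The names `encode74` and `disparity` are hypothetical.

```python
from itertools import product

# Generator rows as printed in the text (assumption: row i <-> message bit i).
G = ["1101000", "0110100", "1110010", "1010001"]

def encode74(msg):
    """XOR together the rows of G selected by the message bits."""
    word = [0] * 7
    for bit, row in zip(msg, G):
        if bit:
            word = [w ^ int(r) for w, r in zip(word, row)]
    return "".join(map(str, word))

def disparity(word):
    """Number of 1's minus number of 0's."""
    return 2 * word.count("1") - len(word)

def complement(word):
    return "".join("1" if b == "0" else "0" for b in word)

codewords = {encode74(m) for m in product((0, 1), repeat=4)}
by_disparity = {}
for c in sorted(codewords):
    by_disparity.setdefault(disparity(c), []).append(c)

for d in sorted(by_disparity):
    print(d, by_disparity[d])
```

The grouping shows seven words in each of the -1 and +1 blocks, with the all zero and all one words standing apart, which is the situation the scrambling step is meant to repair.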


The general technique of generating a self complementary
code [13] is discussed below.

Let the parity check matrix be

$$H = \begin{pmatrix} h_0 \\ h_1 \\ \vdots \\ h_{r-1} \end{pmatrix}, \qquad h_i = [h_{i,0}, h_{i,1}, \ldots, h_{i,n-1}]$$

If

$$\sum_{j=0}^{n-1} h_{i,j} = 0,$$

where $\sum$ implies mod-2 addition, then it is called an even
weight row code.

Let a codeword be

$$W = \{d_0, d_1, \ldots, d_{k-1}, p_0, p_1, \ldots, p_{r-1}\}$$

and let $H = [H' \mid I]$, where $H'_{r \times k}$ is the
information part and $I_{r \times r}$ is the parity check part.
Each parity bit can be obtained as

$$p_i = (d_0, d_1, \ldots, d_{k-1})\,[h'_i]^T, \qquad 0 \le i \le r-1$$

Let the complement of w be

$$\bar{w} = [\bar{d}_0, \ldots, \bar{d}_{k-1}, c_0, \ldots, c_{r-1}]$$

The following holds for odd weight of every row vector $h'_i$ :

$$c_i = (\bar{d}_0, \bar{d}_1, \ldots, \bar{d}_{k-1})\,[h'_i]^T = \bar{p}_i$$

This implies

$$C = \bar{D}\,H'^T = \bar{P}$$

where $D = (d_0, d_1, \ldots, d_{k-1})$, and hence the code is self
complementary.


The all zero codeword is unavoidable because this is a
linear block code. We would like to convert the all-one and
all-zero codewords into other words which suit our requirements.
For this purpose, we can invoke the technique of scrambling by
using the scrambling polynomial $d(x) = x^2 + 1$, which provides
the highest transition density.

Thus, wherever the all zero or all one word occurs, the most
significant bit is complemented and the modified word is
scrambled. The process is shown below.

All zero word 0000000 becomes 1000000, which in polynomial form
is $x^6$. Thus the scrambled word is

$$S(x) = \frac{x^6 \cdot x^d}{d(x)} = \frac{x^6 \cdot x^2}{x^2 + 1}$$

The quotient of division is $x^6 + x^4 + x^2 + 1$, which is 1010101.

All one word 1111111 becomes 0111111, which in polynomial
form is $x^5 + x^4 + x^3 + x^2 + x + 1$. Thus the scrambled word is

$$S(x) = \frac{(x^5 + x^4 + x^3 + x^2 + x + 1) \cdot x^2}{x^2 + 1}$$

The quotient of division is $x^5 + x^4 + x + 1$, which is 0110011.
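
The two divisions above can be reproduced with a few lines of GF(2) polynomial arithmetic. This is a sketch rather than the encoder hardware: polynomials are stored as Python integers (bit i is the coefficient of $x^i$), and the function names `gf2_divmod` and `scramble` are hypothetical.

```python
def gf2_divmod(dividend, divisor):
    """Long division of GF(2) polynomials held as integers."""
    quotient = 0
    top = divisor.bit_length() - 1              # degree of the divisor
    shift = dividend.bit_length() - divisor.bit_length()
    while shift >= 0:
        if (dividend >> (shift + top)) & 1:     # leading term still present
            dividend ^= divisor << shift
            quotient |= 1 << shift
        shift -= 1
    return quotient, dividend                   # (quotient, remainder)

def scramble(word, d_poly=0b101):
    """Multiply the word by x^d and keep the quotient of division by d(x)."""
    deg = d_poly.bit_length() - 1
    quotient, _ = gf2_divmod(int(word, 2) << deg, d_poly)
    return format(quotient, f"0{len(word)}b")

print(scramble("1000000"))   # modified all zero word
print(scramble("0111111"))   # modified all one word
```

With the default `d_poly=0b101`, i.e. $d(x) = x^2 + 1$, the same routine also applies to 8 bit words.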


There are in all $\binom{7}{4} = \binom{7}{3} = 35$ words each of
weight 4 and weight 3. But an exhaustive search shows that there
is no other 7 bit word of weight 3 which is at a distance of at
least 2 from the seven 7 bit codewords of weight 4. Similarly,
there is no other 7 bit word of weight 4 which is at a distance of
at least 2 from the seven 7 bit codewords of weight 3.


Hence we choose the scrambled versions of the all zero and all
one words, viz. 1010101 and 0110011, for transmission. Thus, we
have eight words of disparity -1 and eight words of disparity +1.
The next step is to develop an encoding table with two parts of
opposite disparities. For this purpose, we add to the most
significant bit of each codeword a '0' or a '1'. The reason for
this is that we would like to constrain the word end RDS among the
values 0, ±2. If this is not done and there is a need to transmit
a few -1 or +1 disparity words in a sequence, the RDS may grow
without bound. With the above modification, we have a choice to
transmit a word from the 0 or ±2 disparity words, and hence the
word end RDS also varies between -2 and +2, rendering the code dc
free. The modified code table looks as shown below.



    Message   Disparity   Codeword    Disparity   Codeword
    0000        0         01010101      +2        11010101
    0001       -2         01010001       0        11010001
    0010        0         01110010      +2        11110010
    0011       -2         00100011       0        10100011
    0100       -2         01100100       0        11100100
    0101        0         00110101      +2        10110101
    0110       -2         00010110       0        10010110
    0111        0         01000111      +2        11000111
    1001        0         01101001      +2        11101001
    1010       -2         01001010       0        11001010
    1011        0         00011011      +2        10011011
    1100        0         01011100      +2        11011100
    1101       -2         00001101       0        10001101
    1110        0         00101110      +2        10101110
    1111        0         00110011      +2        10110011


The code characteristics are :

    $|RDS|_{\max}$ = 4
    Maximum run length = 6
    Increase in code rate = 2

The encoder and decoder are as shown in Fig. 3. The above
code does not provide any error correction. It can only detect
errors. The complexity of the system is fairly high.

It can be observed in the encoding table that each message
word has a balanced word. An improvement on the above code
would be to transmit only balanced words, in which case, in the
Finite State Machine representation of the code, there is only one
state and the code is absolutely dc free because the state
represents zero disparity.

The codewords are balanced as described below :

    whenever the disparity is -1, a '1' is added to the MSB
    whenever the disparity is +1, a '0' is added to the MSB

The scrambled sequences are made 1010101 and 0110011 for the all
zero and all one message words, respectively. Upon reception, the
MSB is discarded and the rest of the word is decoded.
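
The balancing rule amounts to a one line decision on the weight of the 7 bit word. A minimal sketch, with hypothetical function names `balance` and `strip_msb`:

```python
def balance(word7):
    """Prefix the MSB that brings a 7 bit word of disparity -1 or +1 to zero disparity."""
    disparity = 2 * word7.count("1") - len(word7)
    if disparity == -1:
        return "1" + word7
    if disparity == +1:
        return "0" + word7
    # The all zero and all one words must be scrambled before balancing.
    raise ValueError("word must have disparity -1 or +1")

def strip_msb(word8):
    """Receiver side: discard the balancing bit."""
    return word8[1:]

for w in ("1010001", "1010101", "0110011"):
    print(w, "->", balance(w))
```

Because the added bit carries no information, the receiver only needs `strip_msb` before decoding the (7,4) word.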


The encoder and decoder structures are as shown in Fig. 3(a).

Now, we describe a scheme for generating the balanced words.

For the (7,4) code, the parity check matrix was given by

$$H = \begin{pmatrix} 1&0&0&1&0&1&1 \\ 0&1&0&1&1&1&0 \\ 0&0&1&0&1&1&1 \end{pmatrix}$$

Maintaining the characteristics of a self complementary code,
we add another row and column to the H matrix. The new parity
check matrix is given by

$$H' = \begin{pmatrix} 1&0&0&0&1&1&0&1 \\ 0&1&0&0&1&0&1&1 \\ 0&0&1&0&1&1&1&0 \\ 0&0&0&1&0&1&1&1 \end{pmatrix}$$

The lower right 3 x 7 block represents the parity check matrix
for the (7,4) code.

Thus the generator matrix is given by

$$G' = \begin{pmatrix} 1&1&1&0&1&0&0&0 \\ 1&0&1&1&0&1&0&0 \\ 0&1&1&1&0&0&1&0 \\ 1&1&0&1&0&0&0&1 \end{pmatrix}$$

The minimum distance of this code increases by 1 and hence it
is 4.
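
The properties claimed for the extended code can be checked directly. The sketch below assumes the parity check matrix has rows 10001101, 01001011, 00101110, 00010111 and the generator matrix has rows 11101000, 10110100, 01110010, 11010001 (a reading of the printed matrices, stated here as an assumption); the names `H_EXT`, `G_EXT` and `encode84` are hypothetical.

```python
from itertools import product

H_EXT = ["10001101", "01001011", "00101110", "00010111"]   # assumed reading of H'
G_EXT = ["11101000", "10110100", "01110010", "11010001"]   # generator rows

def dot(u, v):
    """Inner product over GF(2) of two bit strings."""
    return sum(int(a) & int(b) for a, b in zip(u, v)) % 2

def encode84(msg):
    word = [0] * 8
    for bit, row in zip(msg, G_EXT):
        if bit:
            word = [w ^ int(r) for w, r in zip(word, row)]
    return "".join(map(str, word))

codewords = [encode84(m) for m in product((0, 1), repeat=4)]
orthogonal = all(dot(g, h) == 0 for g in G_EXT for h in H_EXT)
weights = sorted(c.count("1") for c in codewords)
d_min = min(c.count("1") for c in codewords if "1" in c)

print(orthogonal, d_min, weights)
```

Every row of the generator matrix is orthogonal to every row of the parity check matrix, and apart from the all zero and all one words each codeword has weight 4, i.e. it is already balanced.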

The code table is given below :

[Fig. 3(a)]

    Message    Codeword    Disparity
    0000       00000000    -8
    0001       11010001     0
    0010       01110010     0
    0011       10100011     0
    0101       01100101     0
    0110       11000110     0
    0111       00010111     0
    1000       11101000     0
    1001       00111001     0
    1011       01001011     0
    1100       01011100     0
    1101       10001101     0
    1110       00101110     0
    1111       11111111    +8

Using the same scrambling polynomial $d(x) = x^2 + 1$, the all-zero
and all-one words are transformed into 10101010 and 01100110
respectively. The encoder and decoder structures are shown in
Fig. 3(b).

There is an improvement in the code characteristics :

    The maximum run length          = 6
    $|RDS|_{\max}$                  = 3
    The minimum distance $d_{\min}$ = 2

3.2 ERROR CORRECTING DC CONSTRAINED CODES 


It has been observed that if all possible constant weight 
words are used in designing a code, the minimum distance is 2 and 
hence error correction is not provided. If we want to increase 
the minimum distance, we have to omit some of the words. 

We design a (6,2), $d_{\min} = 4$ constant weight balanced code as
described below.

We have $\binom{6}{3} = 20$ words of weight 3 and length 6.
These 20 words can be divided into six classes. Two classes are
of size four and four classes are of size three. Inside each
class, the minimum distance is at least four.

Thus, we pick a class of size four and design a (6,2)
systematic code.

We start with an ordinary (6,2) linear block code. Let the
generator matrix be

$$G = \begin{pmatrix} 1&0&1&0&1&1 \\ 0&1&0&1&1&1 \end{pmatrix}$$

Thus the code set is

$$C = \{000000,\ 010111,\ 101011,\ 111100\}$$

To eliminate the all zero word, we make

$$c_3 = \bar{c}_1, \quad c_4 = \bar{c}_2, \quad c_5 = c_1 + c_2, \quad c_6 = \bar{c}_5$$

Thus we have a (6,2) balanced, weight 3, $d_{\min} = 4$ code. The
code set is

    0 0 1 1 0 1
    0 1 1 0 1 0
    1 0 0 1 1 0
    1 1 0 0 0 1

The encoding table has $2^{n-k}$ rows and $2^k$ columns, where k is
the message length and n is the code length. As we run down the
table we encounter the erroneous versions of the actual codewords,
which are at the top of the table. The erroneous versions can
provide characteristics like limited run length and balance.
Hence, these words are suitable for transmission, and this is what
is done above: a specific coset is chosen and the error control
capability is tested.

For the chosen generator matrix, the parity check matrix is

$$H = \begin{pmatrix} 1&0&1&0&0&0 \\ 0&1&0&1&0&0 \\ 1&1&0&0&1&0 \\ 1&1&0&0&0&1 \end{pmatrix}$$

Thus the syndrome bits are given by

$$s_1 = c_1 + c_3, \quad s_2 = c_2 + c_4, \quad s_3 = c_1 + c_2 + c_5, \quad s_4 = c_1 + c_2 + c_6$$

and $s = (s_1\, s_2\, s_3\, s_4)$ is the syndrome.

Since we have modified the $c_3$, $c_4$ and $c_6$ bits, the
syndrome equations also change. Hence,

$$s'_1 = c_1 + c_3 + 1, \quad s'_2 = c_2 + c_4 + 1, \quad s'_3 = c_1 + c_2 + c_5, \quad s'_4 = c_1 + c_2 + c_6 + 1$$

The addition is mod-2, and the new syndrome is
$s' = (s'_1\, s'_2\, s'_3\, s'_4)$.

Since $d_{\min} = 4$, single error correction is provided. The code
is dc free and the maximum run length is 3. The maximum value of
RDS is 2 and hence the DSV is 4.
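
The modified syndrome equations can be exercised with a short script. This is an illustrative sketch only: the parity check rows are taken as 101000, 010100, 110010, 110001, the constants added to $s_1$, $s_2$ and $s_4$ are held in `OFFSET`, and the function names are hypothetical. It relies on the fact that a single error in position i makes the modified syndrome equal to the i-th column of H.

```python
H = ["101000", "010100", "110010", "110001"]   # parity check rows
OFFSET = (1, 1, 0, 1)                          # the "+1" terms in s1', s2', s4'

def encode62(c1, c2):
    """Systematic encoding into the balanced coset."""
    c5 = c1 ^ c2
    return (c1, c2, 1 - c1, 1 - c2, c5, 1 - c5)

def syndrome(word):
    """Modified syndrome (s1', s2', s3', s4') of a received 6 bit word."""
    return tuple((sum(int(h) & w for h, w in zip(row, word)) + off) % 2
                 for row, off in zip(H, OFFSET))

# Column i of H is the syndrome produced by a single error in position i.
COLUMN_TO_POS = {tuple(int(row[i]) for row in H): i for i in range(6)}

def decode62(word):
    """Correct at most one error and return the two message bits."""
    s = syndrome(word)
    if any(s):
        if s not in COLUMN_TO_POS:
            raise ValueError("multiple errors detected")
        i = COLUMN_TO_POS[s]
        word = word[:i] + (1 - word[i],) + word[i + 1:]
    return word[:2]

for msg in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(msg, encode62(*msg))
```

Any nonzero syndrome that is not a column of H is reported as a multiple error, which matches the detection-only behaviour expected beyond single errors.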

The decoding can be done as shown below :

    s'1 s'2 s'3 s'4     bit in error
    0   0   0   0       none
    1   0   1   1       c1
    0   1   1   1       c2
    1   0   0   0       c3
    0   1   0   0       c4
    0   0   1   0       c5
    0   0   0   1       c6
    any other           multiple errors





3.2.1 Conditions for Unidirectional Error Correction

One of the advantages of a constant weight code is that it can
detect all unidirectional errors [ ], as the theorem below
explains.

Theorem :

Let x and y be two codewords in C. Let $C_j$ be the set of
codewords in C of constant weight j.

If $x \in C_{i+k}$ and $y \in C_i$, then

(a) if $k = 0$, $N(x,y) \ge t+1$ and $N(y,x) \ge t+1$

(b) if $1 \le k \le 2t$, $N(x,y) \ge t + \lfloor k/2 \rfloor + 1$ and
    $N(y,x) \ge t - \lfloor (k-1)/2 \rfloor$

(c) if $k \ge 2t+1$, $N(x,y) \ge 2t+1$

N(x,y) is the number of 1 -> 0 crossovers from x to y.

Proof :

(a) If k = 0, x and y belong to $C_i$, the crossovers occur
    pairwise and N(x,y) = N(y,x).

    $d(x,y) = N(x,y) + N(y,x) \ge 2t+1$
    $N(x,y) = N(y,x) \ge t+1$

(b) If k > 0, x and y are in different sets $C_{i+k}$ and $C_i$.
    The number of 1's in x is exactly k more than that in y. There
    must be at least k crossovers from x to y, i.e.
    $N(x,y) \ge k$. The remaining (n-k) bits of x and (n-k) bits
    of y have the same number of 1's. Thus crossovers in the
    remaining part occur pairwise.

    $N(x,y) = k + N(y,x)$
    $d(x,y) = N(x,y) + N(y,x) \ge 2t+1$
    $N(y,x) \ge t - \lfloor (k-1)/2 \rfloor$ and
    $N(x,y) \ge t + \lfloor k/2 \rfloor + 1$

(c) $N(x,y) = k + N(y,x) \ge 2t+1$

The necessary and sufficient condition for a code to detect
all unidirectional errors is

$$N(x,y) \ge t+1, \quad N(y,x) \ge t+1 \quad \forall\, x, y \in C$$

For a balanced code which can correct t errors, this rule is
satisfied.
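
The crossover condition is easy to test on a concrete code. The sketch below counts N(x,y) for the four words of the (6,2) balanced code of Section 3.2 with t = 1; the function name `crossovers` is hypothetical.

```python
def crossovers(x, y):
    """N(x, y): number of positions where x has a 1 and y has a 0."""
    return sum(a == "1" and b == "0" for a, b in zip(x, y))

CODE = ["001101", "011010", "100110", "110001"]   # (6,2) balanced code set
t = 1                                             # single error correction

pairs = [(x, y) for x in CODE for y in CODE if x != y]
condition_holds = all(crossovers(x, y) >= t + 1 for x, y in pairs)

for x, y in pairs:
    print(x, y, crossovers(x, y))
print(condition_holds)
```

Since all words have the same weight, this is case (a) of the theorem (k = 0): the crossovers occur pairwise, so N(x,y) = N(y,x) for every pair.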

3.2.2 Properties of Product Codes

Coset codes are also called run length limited (RLL) codes.
To improve the correction capability of the code, the product code
approach has been devised.

If $C_1$ and $C_2$ are two linear codes with parameters
$(n_1, k_1)$ and $(n_2, k_2)$, then the product code C has
parameters $(n_1 n_2, k_1 k_2)$ and minimum distance $d_1 d_2$,
where $d_1$ and $d_2$ are the $d_{\min}$ of $C_1$ and $C_2$.
$k_2$ blocks of length $k_1$ are encoded with $C_1$. From the
$k_2 \times n_1$ array thus formed, each column is encoded into a
codeword in $C_2$. Thus we have an $n_2 \times n_1$ array which is
the codeword of C. Each row is a codeword in $C_1$ and each column
is a codeword in $C_2$. The resulting array is transmitted column
or row wise.

Property 1 :

The product code can correct up to
$\lfloor (d_1 d_2 - 1)/2 \rfloor$ errors and also any error burst
of length $b = \max(n_1 t_2, n_2 t_1)$, where
$t_2 = \lfloor (d_2 - 1)/2 \rfloor$ and
$t_1 = \lfloor (d_1 - 1)/2 \rfloor$.

Property 2 :

If $b_1$ and $b_2$ are the burst error correcting capabilities of
$C_1$ and $C_2$, then the product code can correct at least all
error bursts of length $b = \max(n_1 b_2, n_2 b_1)$. Usually $C_1$
is chosen to be a run length limited (RLL) code and $C_2$ is any
error correcting code, and transmission is done row wise. Such
codes can be systematic or non-systematic and they introduce
considerable redundancy.

We introduce a new method of designing such array codes.
The codes here are systematic.

3.2.3 Product Code Approach to Construct DC Constrained Codes

We choose $C_2$ to be a systematic RLL code, which need not be
an error correcting code. The systematic RLL code design has been
discussed previously. The code array is as shown below. The
information bits are arranged in the first $k_1$ rows and $k_2$
columns. The $k_1$ rows are encoded as codewords in $C_2$. The
resulting $n_2$ columns are encoded as codewords in $C_1$. If $C_1$
and $C_2$ are both linear, it does not matter which is encoded
first.



Since redundancy is to be kept as low as possible, for the 
column codes we choose two parity bits, one complementary to 
other, to provide balance and the parity bits indicate even and 
odd parity respectively. Thus, the row codes are RLL codes and 



The code array looks as shown below :

    <- k2 ->
    0  0  |  1  0
    1  0  |  1  1
    1  1  |  1  0
    0  1  |  1  1
    1  0  |  1  0

Any two codewords must differ in at least one position (at least
one information bit). Let this be in the ith row of the two
codewords, where $1 \le i \le k_1$. The minimum distance for this
array code is $d_r$, where $d_r$ is the minimum distance of the
row code.

The differences between these codes and product codes are :

(1) The bottom $n_1 - k_1$ rows need not be codewords in $C_2$.

(2) The codewords that result from encoding columns first and
    rows second, and vice versa, need not be the same.

(3) The column code is a fixed code, whereas the row code depends
    on the error control capability required.

(4) The transmission is only row wise because the rows are RLL.

Now we discuss the decoding procedure of the above mentioned
code.

Let the code under consideration be a single error correcting
code.

(a) Check the first $k_1$ rows for errors. Let the number of rows
    in error be $r_1$.

(b) Check all the $n_2$ columns for errors. Let the number of
    columns in error be $r_2$.

(c) If $r_1 = 0$ and $r_2 = 0$ or 1, there is no error in the
    information part.

(d) If $r_1 = r_2 = 1$, then there is an error at the intersection
    of the row and column, and it can be corrected by
    complementing that bit.

(e) If $r_1$ or $r_2$ is greater than 1, there are multiple errors;
    they cannot be corrected, and hence an error is detected.
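
Steps (a) to (e) can be sketched as follows. This is an illustration under assumptions that go beyond the text: the row code is taken to be the (6,2) balanced code of Section 3.2, each column carries an even parity bit followed by its complement, and the column check uses only the even parity row. All function names are hypothetical.

```python
H = ["101000", "010100", "110010", "110001"]   # parity checks of the row code
OFFSET = (1, 1, 0, 1)                          # coset constants of the row code

def encode_row(c1, c2):
    c5 = c1 ^ c2
    return [c1, c2, 1 - c1, 1 - c2, c5, 1 - c5]

def row_in_error(row):
    """True if the modified syndrome of the row is nonzero."""
    return any((sum(int(h) & b for h, b in zip(hrow, row)) + off) % 2
               for hrow, off in zip(H, OFFSET))

def encode_array(info_rows):
    """k1 encoded rows followed by an even parity row and its complement."""
    rows = [encode_row(c1, c2) for c1, c2 in info_rows]
    even = [sum(col) % 2 for col in zip(*rows)]
    odd = [1 - p for p in even]
    return rows + [even, odd]

def decode_array(array, k1):
    rows = [list(r) for r in array[:k1]]
    even = array[k1]
    bad_rows = [i for i, r in enumerate(rows) if row_in_error(r)]       # step (a)
    bad_cols = [j for j, col in enumerate(zip(*rows))
                if sum(col) % 2 != even[j]]                             # step (b)
    r1, r2 = len(bad_rows), len(bad_cols)
    if r1 == 1 and r2 == 1:                                             # step (d)
        rows[bad_rows[0]][bad_cols[0]] ^= 1
    elif not (r1 == 0 and r2 <= 1):                                     # steps (c), (e)
        raise ValueError("multiple errors detected")
    return [tuple(r[:2]) for r in rows]

info = [(0, 1), (1, 1), (0, 0)]
received = encode_array(info)
received[1][4] ^= 1                       # inject a single error
print(decode_array(received, k1=3))
```

The array is balanced because every encoded row is balanced and the two parity rows are complements of each other. Combinations that the text does not list, such as one row in error with no column in error, are treated here as detected errors.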

Multiple Error Correction :

(a) Check the first $k_1$ rows for errors. Let the number of rows
    in error be $r_1$.

(b) Check all columns for errors. Let the number of columns in
    error be $r_2$.

(c) If $r_1 > t$ and $r_2 > t$, there are more than t errors and
    hence they cannot be corrected. ($C_2$ is assumed to be a t
    error correcting code.)

(d) If $r_1 \le t$ and $r_2 \le t$, errors are corrected as
    follows. Consider the $r_1 r_2$ positions at the intersection
    of the $r_1$ rows and $r_2$ columns. Correct the errors by
    complementing the bits in these positions.

This code can correct $k_1$ errors. If $b_2$ is the burst error
correcting capability of $C_2$, then it can correct $k_1$ bursts
of errors.

The decoding of the array code can be more easily understood 
by the following example. Let the code matrix formed be 





1 0 0 I 1 0 


0 0 1 | 1 1 
I 

110 10 0 

Let $r_1$ and $r_2$ be two each after decoding, i.e.
$r_1 = r_2 = 2$. In such a case, the box shown represents the
region in which the errors lie. Let us assume a single error
correcting code. If there is more than one error in the two rows
under consideration, they cannot be corrected. In such a case one
arbitrary bit from each row in the box is complemented (both made
either 0 or 1) and the decoding rule is applied; similarly, the
other bits are also tested for error correction. Correction, if
possible, is made; otherwise multiple errors are detected.


0 

1 

o 1 

1 

0 

1 

1 

1 

1 

1 1 

0 

0 


The row codes are to be chosen of even length and with
balance of 1's and 0's. Errors in the columns are found
by comparing parity with that indicated by the parity bit. The
bottom $(n_1 - k_1)$ rows are not error correcting codewords in
$C_2$. Hence, decoding of these rows is not done.

The decoding algorithm is simple, with the added advantage
that the code is systematic. The code is dc free because it is
balanced. The increase in code rate is [ ]. The maximum run
length is $n_2$ and hence the maximum RDS is $n_2 - 1$.

In this section, we have seen the design of balanced codes,
both error correcting and only error detecting types. Proceeding
in this tenor, we now give a technique for selecting the code
length (n) and message length (k) for such dc constrained codes.

Let $C_i$ be the set whose elements are codewords each with
weight i. For any t error correcting linear block code, if $C_0$
and $C_n$ exist, then $C_1, C_2, \ldots, C_{2t}$ and
$C_{n-1}, C_{n-2}, \ldots, C_{n-2t}$ do not exist. This is because
they do not satisfy the minimum distance rule.

Thus, if we make $2t+1 = n-2t-1$, which implies
$C_{2t+1} = C_{n-2t-1}$, there exist $C_0$, $C_{2t+1}$ and $C_n$
only. So, if $2t+1 = n-2t-1$, then $n = 4t+2$ and $t = (n-2)/4$.
The codewords are $C_0$, $C_n$ and the elements of $C_{n/2}$. The
codewords of $C_{n/2}$ are balanced and the only unbalanced words
are the all-zero and all-one words. The minimum distance for such
a code is $d_{\min} \ge n/2$. For any linear block code,
$d_{\min} \le n-k+1$, so

$$\frac{n}{2} \le n-k+1 \quad \text{and} \quad k \le \frac{n}{2} + 1$$

Hence, for $n = 4t+2$, $k \le \frac{n}{2} + 1$.

The (6,2) balanced code designed previously satisfies this rule.
There, the all zero word is converted to a weight-3 word using a
suitable transformation and the all one word does not exist. We
assume n to be even; the all-one word will exist if there is an
odd number of 1's in each column of the generator matrix.

COMPARISON OF DIRECT AND COMBINED 8B10B CODE

4.1 DESIGN OF A COMBINED 8B10B CODE USING 3B4B AND 5B6B CODES

A new type of dc balanced, partitioned 8B10B transmission
code has been proposed by Franaszek [6]. The beauty of this code
lies in the fact that the look-up table is very small in size and
the implementation is rather easy. We discuss the important
features of this code in the following.

Modern communication networks transmit information in the
form of bytes with a defined field structure for address,
information and error control. The number of bits in each of
these is found useful and it can be implemented more easily by
suitably combining 3B/4B and 5B/6B codes.

The complexity of a code is defined by the number of
information bits that must be examined by a coder in choosing a
codeword. The various coding alternatives for the above mentioned
code are as follows :

(a) The block code has parameters d = 0, k = 4 and v = 6 with
    rate 4/5, where
        d = minimum run length - 1
        k = maximum run length - 1
        v = digital sum variation.
    Error propagation is limited to five bits.

(b) A standard block code with no look ahead, with parameters
    d = 0, k = 3, v = 5, s = 8, w = 10 and m = 1, where
        s = number of information bits
        w = number of code bits
        m = number of s bit groups that must be considered
            while choosing a codeword.
    Error propagation is in general 8 bits.

(c) A code with length 5 with k = 4, v = 5, s = 4 and m = 2.
    Error propagation is five bits. However, there is little
    redundancy available here for suitable mapping of characters.

Code Design :

Each incoming byte is partitioned into two sub-blocks. The
first 5 bits are encoded into 6 bits and the next 3 bits are
encoded into 4 bits. Thus the combined word has 10 bits. For the
sub-blocks, the permitted disparity is 0, +2 or -2. The coding
rules require that the polarity of nonzero disparity blocks
alternates. No distinction is made between 6B and 4B sub-blocks.
Nonzero disparity code points are assigned in complementary pairs
to a single source data point. The encoding functions generate
one of them; if it violates the alternating polarity rule, the
complete sub-block is inverted. Determination of disparity and
polarity in the 6B encoder is followed by the corresponding
operations in the 4B encoder; then the running disparity
parameter is passed along for encoding of the next byte.
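
The alternating polarity rule can be sketched independently of the actual 5B/6B and 3B/4B tables. In the example below the sub-block values are purely illustrative (they are not entries of Tables 3 and 4), the starting polarity is an assumption, and the function names are hypothetical.

```python
def disparity(block):
    """Number of 1's minus number of 0's in a sub-block."""
    return 2 * block.count("1") - len(block)

def invert(block):
    return "".join("1" if b == "0" else "0" for b in block)

def apply_polarity_rule(blocks, last_polarity=-1):
    """Invert any nonzero disparity sub-block whose sign repeats the previous one.

    last_polarity is the sign of the most recent nonzero disparity sub-block
    (assumed negative at start-up).
    """
    out = []
    for block in blocks:
        d = disparity(block)
        if d != 0:
            if (d > 0) == (last_polarity > 0):   # would violate alternation
                block = invert(block)
                d = -d
            last_polarity = 1 if d > 0 else -1
        out.append(block)
    return out

stream = ["111000", "1101", "1011", "0100", "0010"]   # illustrative sub-blocks
print(apply_polarity_rule(stream))
```

Zero disparity sub-blocks pass through unchanged and do not alter the running polarity, so 6B and 4B sub-blocks are handled by the same rule.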

The 3B4B and 5B6B encoding tables are as shown.

Table 3 5B/6B Encoding.

Table 4 3B/4B Encoding.

Evaluation of 8B10B code :

The maximum DSV for this code at arbitrary points is 6. The
maximum DSV between sub-block boundaries is 2. All the 6B and 4B
sub-blocks and the complete 10 bit characters have a disparity of
either 0 or ±2. Each valid character in the 10B alphabet has
either five 1's and five 0's, or six 1's and four 0's, or six 0's
and four 1's. The maximum run length of the code sequences is 5.

Error detection :

One method for error detection is checking the disparity of
the received word. Nonzero disparity blocks must have alternate
polarity. Some conditions, as shown below, can be attributed to
errors :

    a = b = c = d
    f = g = h = j
    e = i = f = g = h

where abcdeifghj is the codeword.

The simplest error patterns which may escape detection by the
code are a single erroneous 1 complemented by a single erroneous
0. Such complementary errors, when confined to a single word, may
simply change it into another valid codeword. Thus it is
possible that a complementary pair of digit errors can change the
disparity of the sub-blocks in conformance with the alternating
polarity rule, such that the errors are not detectable by the code.

4.2 COMPARISON OF DIRECT 8B10B CODE WITH COMBINED 8B10B CODE

A close observation of the combined 8B10B code designed by
Franaszek [6] shows that it uses 128 balanced and 128 ±2
disparity words of length 10 bits. In the following, we present
an alternative 8B/10B code by direct transformation, which makes
use of 226 balanced words. For simplification in forming the code
table, we divide the 10 bit block into 4 bit and 6 bit
sub-blocks. Codewords which have a repetitive pattern have been
left out because they induce pattern dependent jitter. The
codeword is abcdefghij.

We have 118 balanced words from set I, 52 balanced words
from set II and 56 balanced words from set III.


Set I

    No. of codewords    abcd    efghij
    20                  0011    101010
    19                  0101    110001
    20                  0110    100011
    20                  1001    110010
    19                  1010    100110
                                110100
                                100101
                                010110
                                001011
                                101001
                                011001
                                011010
                                001110
                                101100
                                001101
                                010101
                                011100
                                010011
                                111000
                                000111

Set II

    No. of codewords    abcd    efghij
    13                  1011    001100
    13                  1101    010100
    13                  1110    001001
    13                  0111    010001
                                001010
                                011000
                                101000
                                100100
                                010010
                                000110
                                000101
                                100010
                                100001

Set III

    No. of codewords    abcd    efghij
    14                  0100    110011
    14                  1000    101011
    14                  0010    110110
    14                  0001    101110
                                110101
                                100111
                                010111
                                011011
                                101101
                                111001
                                011101
                                011110
                                111100

This code has a minimum distance 2.
The codewords of disparity -2 are

    No. of codewords    abcd
    13                  1010
    13                  1001
        (combined with the 6 bit set of II)
     4                  1100

    Total = 30

The codewords of disparity +2 are the complement of the above 30
codewords.

Thus we have 226 + 30 = 256 codewords.

The transmission rule requires that the polarity of the disparity
of adjacent codewords must alternate. In the code table, the first
226 input 8 bit words will be assigned balanced words, and the
remaining 30 input 8 bit words will be assigned complementary ±2
disparity words. The storage space required is 256x10 bits. For
decoding, we can use an EPROM of suitable size. Now we
compare the properties of this code with that proposed by
Franaszek.

(a) The maximum run length here = 6. In the Franaszek code it was 5.

(b) $|RDS|_{\max}$ = 3. Same as that in that code.

(c) DSV = 6. Same as that in that code.

(d) WRDS = 0, ±2. Same as that in that code.

This is also a dc free code.

Thus there may not be much advantage in designing the 8B/10B code
by combining 5B6B and 3B4B codes, as implementation difficulties
are expected to be similar in nature.

Since we have discussed in detail the dc constrained codes, it is
of natural interest to discuss the efficiency of such codes.
Work in this area has been done by Chien [10]. We discuss the
topic in detail below.


4.3 EFFICIENCY OF DC CONSTRAINED CODES 


Let M, a positive integer, be the desired bound on the RDS of the
coded binary signal stream. This defines a subset $S_M(\infty)$ of
all infinite binary sequences. An infinite sequence is in the
subset $S_M(\infty)$ if the RDS of the sequence is nowhere larger
than M or less than -M, i.e.

$$\left| \sum_{i=1}^{k} a_i \right| \le M \quad \text{for } k = 1, 2, \ldots$$

A sequence in $S_M(\infty)$ is an allowable sequence. Denote by
$N_M(\infty)$ and $N(\infty)$ the number of sequences in
$S_M(\infty)$ and $S(\infty)$ respectively. Assuming equally
probable symbols, the average information per symbol for the
sequences in $S_M(\infty)$ is $\log_2 N_M(\infty)$.

The efficiency of the dc constrained code is

$$\eta = \frac{\log_2 N_M(\infty)}{\log_2 N(\infty)}$$

The next step is to find the number of sequences $N_M(\infty)$.
Let the sequence length be L. Define an occupancy vector

$$u_L = \{u_{-M}, \ldots, u_0, \ldots, u_{+M}\}$$

where $u_k$ is the number of allowable sequences of length L with
WRDS = k. The total number of sequences is

$$N_M(L) = \sum_{k=-M}^{+M} u_k$$

As $L \to \infty$, $N_M(L) \to N_M(\infty)$, and the total number of
sequences of length L is $N(L) = 2^L = N(\infty)$. So

$$\eta = \lim_{L \to \infty} \frac{\log_2 N_M(L)}{L}$$

Further details can be seen in [ ].

In the next chapter, we take a look at the guided scrambling
concept and see how it incorporates the features of the coding.
Finally, we end with the conclusions on the present work, with a
note on suggestions for continuation of work in this area.

PRINCIPLE OF GUIDED SCRAMBLING

Scrambling is a process of randomizing a sequence. This
process has the capability to provide a balanced sequence and
adequate timing information. But, in the worst case, it can also
provide the least timing information and large run lengths. So, in
general, scrambling can produce a sequence which may or may not
satisfy line code requirements, and for line coding purposes we
cannot rely totally on scrambling. Guided scrambling is another
technique which involves scrambling but provides alternatives
for the transmitted sequence, so that the one with the best line
code features can be chosen.

In this chapter, we briefly discuss the principle of guided
scrambling [ ].

It has been observed that when a sequence is augmented by
appending m bits ($m \ge 1$) and then scrambled, for at least one
combination of the m bits, a scrambled sequence which satisfies
the line code requirements is produced. This is the basis for
guided scrambling. The principle of guided scrambling is as
shown in Fig. 5.1a. Implementation of a guided scrambling scheme
is shown in Fig. 5.1b.

The encoder views the source stream as a series of words k
bits in length. Each word is augmented by m bits, resulting in an
augmented word of length n = k+m. There are $2^m$ augmenting bit
patterns. The resulting augmented bit streams are simultaneously
multiplied by $x^d$ and divided by $d(x)$ to form $2^m$ quotients,
where d is the degree of $d(x)$. The quotient with the best line
code characteristics is chosen for transmission. Decoding is done
through multiplication by $d(x)$ and division by $x^d$. The
augmenting bits are placed in the most significant positions. The
scrambling polynomial must be properly chosen to ensure that there
is at least one quotient with good line code characteristics in
every selection set.
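The encoding and decoding steps just described can be sketched for a single word with block coding (registers cleared for every word). This is an illustrative sketch, not the thesis implementation: polynomials over GF(2) are stored as Python integers, the selection rule (smallest absolute word disparity) is an assumed stand-in for "best line code characteristic", and the function names and the example values k = 6, m = 2, $d(x) = x^2 + 1$ are hypothetical.

```python
def gf2_divide(num, den):
    """Divide GF(2) polynomials held as integers; return (quotient, remainder)."""
    quotient = 0
    den_len = den.bit_length()
    while num.bit_length() >= den_len:
        shift = num.bit_length() - den_len
        quotient |= 1 << shift
        num ^= den << shift
    return quotient, num


def gf2_multiply(a, b):
    """Carry-less (mod-2) product of two polynomials held as integers."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1
    return product


def quotient_set(word, k, m, d_poly):
    """All 2**m quotients of x**d * (augmented word) / d(x)."""
    d = d_poly.bit_length() - 1            # degree of d(x)
    quotients = []
    for aug in range(1 << m):
        augmented = (aug << k) | word      # augmenting bits are most significant
        q, _ = gf2_divide(augmented << d, d_poly)
        quotients.append(q)
    return quotients


def disparity(q, n):
    """Number of 1's minus number of 0's in an n-bit word."""
    return 2 * bin(q).count("1") - n


def gs_encode(word, k, m, d_poly):
    """Pick the quotient with the smallest absolute disparity (assumed rule)."""
    n = k + m
    return min(quotient_set(word, k, m, d_poly),
               key=lambda q: abs(disparity(q, n)))


def gs_decode(q, k, d_poly):
    """Multiply by d(x), divide by x**d, then drop the augmenting bits."""
    d = d_poly.bit_length() - 1
    augmented = gf2_multiply(q, d_poly) >> d
    return augmented & ((1 << k) - 1)


if __name__ == "__main__":
    K, M, D_POLY = 6, 2, 0b101             # d(x) = x^2 + 1
    sent = gs_encode(0b101101, K, M, D_POLY)
    print(format(sent, "08b"), gs_decode(sent, K, D_POLY) == 0b101101)
```

Decoding works without knowing which augmenting pattern was chosen, because the remainder discarded by the encoder has degree below d and is removed again by the division by $x^d$.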

The presence of complementary quotients is necessary for
balanced transmission. Maximization of the number of complementary
quotients allows for optimisation of secondary line code
characteristics. A quotient selection set which consists of $2^m$
quotients generated by $d(x) = x^m + 1$ will contain the maximum of
$2^{m-1}$ pairs of complementary quotients. While an appropriate
scrambling polynomial ensures a choice for quotient selection, a
guided scrambling encoder must select the appropriate quotient to
realize the desired output characteristics.
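The statement about complementary quotients can be checked numerically. The sketch below assumes block coding with $d(x) = x^m + 1$ and uses an assumed selection rule (choose the quotient that brings the accumulated word-end digital sum closest to zero); the function names are hypothetical.

```python
def gf2_divide(num, den):
    """Divide GF(2) polynomials held as integers; return (quotient, remainder)."""
    quotient = 0
    den_len = den.bit_length()
    while num.bit_length() >= den_len:
        shift = num.bit_length() - den_len
        quotient |= 1 << shift
        num ^= den << shift
    return quotient, num


def quotient_set(word, k, m):
    """Quotient selection set for d(x) = x**m + 1 (so d = m)."""
    d_poly = (1 << m) | 1
    return [gf2_divide((((aug << k) | word) << m), d_poly)[0]
            for aug in range(1 << m)]


def complementary_pairs(quotients, n):
    """Count pairs {q, complement of q} present in the set of n-bit quotients."""
    ones = (1 << n) - 1
    present = set(quotients)
    return sum(1 for q in present if q < (q ^ ones) and (q ^ ones) in present)


def encode_balanced(words, k, m):
    """Select, word by word, the quotient that keeps the running sum nearest 0."""
    n = k + m
    rds = 0
    selected, trace = [], []
    for word in words:
        best = min(quotient_set(word, k, m),
                   key=lambda q: abs(rds + 2 * bin(q).count("1") - n))
        rds += 2 * bin(best).count("1") - n
        selected.append(best)
        trace.append(rds)              # digital sum at the end of each word
    return selected, trace


if __name__ == "__main__":
    print(complementary_pairs(quotient_set(0b10110, 5, 3), 8))
    _, trace = encode_balanced(list(range(32)), 5, 3)
    print(max(abs(v) for v in trace))
```

Because every quotient has its complement available, a word of disparity D can always be replaced by one of disparity -D, which is what keeps the word-end sum in this sketch within the word length n.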

The encoding registers can be updated by block coding,
continuous coding, or combined block/continuous coding. In block
coding, the shift registers are cleared prior to each source word
pre-multiplication by $x^d$ and division by $d(x)$. In continuous
coding, the registers are updated following quotient selection to
contain the remainder associated with the selected quotient. In
combined coding, the registers are updated, rather than cleared,
following quotient selection. In this process, the first d bits of
the incoming next augmented word undergo mod-2 addition with the
contents of the register. When $d \le m$, only augmentation bit
values are affected. Decoding through continuous or block
multiplication then yields the same bit stream following removal
of the augmenting bits.
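The remainder-carrying update can be sketched as follows. This is an assumption-laden illustration rather than the thesis design: the register is modelled as an integer remainder that is added (mod 2) to the first d bits of the next augmented word, the selection rule (running sum nearest zero) is assumed, and all function names are hypothetical. With $d \le m$ the sketch shows block and continuous decoding returning the same source words.

```python
def gf2_divide(num, den):
    """Divide GF(2) polynomials held as integers; return (quotient, remainder)."""
    quotient = 0
    den_len = den.bit_length()
    while num.bit_length() >= den_len:
        shift = num.bit_length() - den_len
        quotient |= 1 << shift
        num ^= den << shift
    return quotient, num


def gf2_multiply(a, b):
    """Carry-less (mod-2) product of two polynomials held as integers."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        a <<= 1
        b >>= 1
    return product


def encode_continuous(words, k, m, d_poly):
    """Guided scrambling with the remainder kept between words."""
    d = d_poly.bit_length() - 1
    n = k + m
    remainder, rds, selected = 0, 0, []
    for word in words:
        best = None
        for aug in range(1 << m):
            augmented = (aug << k) | word
            # The stored remainder lines up with the first d bits of the word.
            dividend = (remainder << n) ^ (augmented << d)
            q, r = gf2_divide(dividend, d_poly)
            disp = 2 * bin(q).count("1") - n
            if best is None or abs(rds + disp) < abs(rds + best[2]):
                best = (q, r, disp)
        q, remainder, disp = best
        selected.append(q)
        rds += disp
    return selected


def decode_block(quotients, k, d_poly):
    """Word-by-word multiplication by d(x); valid here because d <= m."""
    d = d_poly.bit_length() - 1
    mask = (1 << k) - 1
    return [(gf2_multiply(q, d_poly) >> d) & mask for q in quotients]


def decode_continuous(quotients, k, m, d_poly):
    """Multiply the whole quotient stream by d(x), then split into words."""
    d = d_poly.bit_length() - 1
    n = k + m
    stream = 0
    for q in quotients:
        stream = (stream << n) | q
    augmented = gf2_multiply(stream, d_poly) >> d
    mask = (1 << k) - 1
    count = len(quotients)
    return [(augmented >> ((count - 1 - i) * n)) & mask for i in range(count)]


if __name__ == "__main__":
    source = [0b101101, 0b000000, 0b111111, 0b010011]
    sent = encode_continuous(source, 6, 2, 0b101)
    print(decode_block(sent, 6, 0b101) == source,
          decode_continuous(sent, 6, 2, 0b101) == source)
```

In block decoding the carried remainder disturbs only the top d bits of each recovered augmented word, so with $d \le m$ the disturbance falls entirely on augmenting bits that are discarded anyway.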

A description of the diagrams follows.

The message stream is divided into blocks of k bits each, and
m bits are appended to each block. This takes place in the block
'Augment'. Scrambling of the augmented bit stream takes place in
the block 'Scramble'. The scrambled bit stream used for
transmission is monitored in the block 'Monitor'. Descrambling of
the received sequence takes place in the block 'Descramble'. The
augmenting bits are removed in the block 'Expurgate'.

A more detailed diagram of the GS scheme is shown in Fig. 5.1b.
Since there are m augmenting bits, the message stream is
augmented by $2^m$ patterns simultaneously in the blocks
0, 1, ..., $2^m - 1$. All the $2^m$ augmented patterns are scrambled
in the blocks $x^d/d(x)$. The scrambled version which best meets the
line code requirements is selected in the block 'Select'. At the
receiving end, the sequence is descrambled in the $d(x)/x^d$ block
and the augmenting bits are removed to retrieve the message sequence.
6.1 CONCLUSION

A set of desirable features which a line code must satisfy
in order to provide reliable communication over fibre optic
channels was mentioned in the beginning. A detailed study,
starting with codes which do not provide error correction and
continuing with those which have error correcting capability,
has been done. The primary effort has been to keep the redundancy
as low as possible, so that the efficiency of the code is good
and the increase in data rate is low.

One important observation is that we have predominantly used
only block codes, irrespective of whether they are linear or
nonlinear, systematic or non-systematic. The reason for this is
that the question is not whether we use a block code or a cyclic
code; the important consideration is which type of code best
meets the line code requirements. We found that block codes
are sufficient to meet the requirements and that they are simple
to encode and decode. Implementation of these codes also does not
pose much difficulty.

Since scrambling is another technique which is frequently
used in digital transmission systems, we studied a modification
of it to obtain dc free block codes, as presented in chapter
five.



In chapter two, we dealt with balanced codes, of both Error 
detecting and Correcting types. Since, in telecommunication 
systems, normally 8 bit PCM words are used, these codes having a 
typical code length of 16 may fit in well. The coset codes 
presented were derived from partitioning linear block codes. 
These codes can be easily implemented using ROM tables. 


In chapter three, we have proposed some modified block codes
which are more efficient than those discussed in chapter two. We
calculated the run length, DSV, RDS, etc. of these codes, which
showed a distinct improvement. A quick comparison of these codes
with the dc free coset codes given in chapter two shows that,
for codes without error correction, the maximum run length is n,
whereas for the dc free coset codes it is 2n + ... The $RDS_{max}$
for these codes is ..., whereas for the dc free codes it is n + ...




A comparison of dc free codes with error correction shows
that, for these codes, the maximum run length is n, and for the dc
free codes it is $2w_{max} + \lfloor w_{max}/2 \rfloor$, where
$w_{max}$ is the maximum weight of a sub-block $g_j$,
$\sum_j g_j = 1_n$, n is the code length and $1_n$ is an all-one
vector of length n.

After designing balanced error correcting codes, we designed
burst correcting balanced array codes. The idea for this design is
derived from product codes. We discussed the difference between
product codes and these array codes. One difference is that we
transmit the array codeword only row-wise. The column codes are
not EC codes. The row codes are balanced codes. We also discussed
the error correction algorithm for the array codes.
In chapter four, we discussed the combined 8B10B code,
designed by combining 3B4B and 5B6B codes. This was followed by a
directly coded 8B10B code. We made a comparison between the two
codes. For the combined code, we need three different encoding
tables, but they are smaller in size. For the direct code, we
need a single table with a size of at least 256x10 bits. Later we
made a comparison of details like run length, RDS, etc. of the
codes.

The maximum run length for the direct code is 6, whereas that
for the combined code is 5. The $RDS_{max}$ for both the codes is 3.
The DSV for both the codes is 6. The WRDS for both the codes is 0
or ±2. Both the codes are dc free. We concluded the chapter with a
discussion on the efficiency of dc constrained codes.

6.2 SUGGESTIONS FOR FUTURE WORK

We have laid down a theoretical basis for the design of dc
free codes. The point of interest is to make an attempt to
implement them in hardware and examine whether or not they
are efficient in terms of memory space and speed. Since we have
not worked out the power spectral densities of these codes, it
would be worthwhile to work out the PSDs and try to obtain some
information from them. As mentioned before, we have used
only block codes since they are easy to design and meet the
requirements. It might be possible to achieve similar results by
using the cyclic code approach. One may also take up the
convolutional code approach, calculate similar details and
make a comparison with the block codes described here in terms
of hardware complexity, ease of implementation, etc.



APPENDIX 


DEFINITIONS 


(1) Weight : The number of 1's in a codeword.

(2) Hamming distance : The weight of x+y for two codewords x and
y, where addition is mod-2.

(3) Minimum distance : It is the minimum of the distances
between all pairs of codewords. It is represented by $d_{min}$. It
has to be at least d in order to detect any error pattern
of weight d-1 or less.

(4) A code can detect and correct all patterns of t or fewer
errors if and only if the code has $d_{min} \ge 2t+1$.

(5) Disparity : It is defined as the difference between the
number of 1's and 0's in a codeword, $D = \sum_{i=1}^{n} a_i$.

(6) Running Digital Sum (RDS) : The RDS at any instant is
defined as the accumulated difference between the number of
transmitted 1's and 0's in a digital stream.

(7) Word end Running Digital Sum (WRDS) : It is the RDS measured
at the end of a codeword.
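The definitions above translate directly into a few helper functions. The sketch below is illustrative only: codewords are represented as lists of 0/1 values, a 1 is counted as +1 and a 0 as -1 when forming sums, and the function names and the small example code are hypothetical.

```python
from itertools import combinations


def weight(word):
    """Number of 1's in a codeword."""
    return sum(word)


def hamming_distance(x, y):
    """Weight of the mod-2 sum of two codewords of equal length."""
    return weight([a ^ b for a, b in zip(x, y)])


def minimum_distance(code):
    """Smallest Hamming distance over all pairs of distinct codewords."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))


def correctable_errors(code):
    """Largest t with d_min >= 2t + 1."""
    return (minimum_distance(code) - 1) // 2


def disparity(word):
    """Number of 1's minus number of 0's in a codeword."""
    return sum(1 if bit else -1 for bit in word)


def running_digital_sum(bits):
    """RDS after each transmitted bit of a stream."""
    total, trace = 0, []
    for bit in bits:
        total += 1 if bit else -1
        trace.append(total)
    return trace


def word_end_rds(codewords):
    """RDS sampled at the end of each codeword (WRDS)."""
    total, trace = 0, []
    for word in codewords:
        total += disparity(word)
        trace.append(total)
    return trace


if __name__ == "__main__":
    code = [[0, 0, 0, 0, 0], [1, 1, 1, 0, 0], [0, 0, 1, 1, 1], [1, 1, 0, 1, 1]]
    print(minimum_distance(code), correctable_errors(code))
    print(running_digital_sum([1, 0, 0, 1]))
    print(word_end_rds([[1, 1, 0, 0], [1, 1, 1, 0]]))
```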